Task

Given a product $P=\{p_1, p_2, ..., p_m\}$ containing $m$ tokens and a corresponding review sentence $R=\{r_1, r_2, ..., r_n\}$ containing $n$ tokens, the two are combined into a single sentence $S=\text{``This is a review of \textit{P}: \textit{R}''}$. The Subject-Object-Category-Preference (SOCP) Quadruple Extraction task first identifies whether $S$ is a comparative sentence and, if so, extracts the set of comparative quadruples in $S$:

$$ \begin{equation} {\cal{S}}_{SOCP} = \{..., (sub, obj, cc, cp)_i, ...\}, \end{equation} $$

where $sub$ denotes the subject entity, corresponding to $P$; $obj$ represents the object entity being compared with $sub$; $cc \in \cal{C}$ denotes the comparative category, that is, the category of the aspect on which $sub$ and $obj$ are compared, where $\cal{C}$ is a predefined set of categories; and $cp \in \{\text{BETTER, WORSE, EQUAL}\}$ denotes the comparative preference, indicating whether $sub$ is better than, worse than, or equal to $obj$.
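The definitions above can be sketched in Python as follows. This is only an illustrative sketch: the names `Quadruple`, `PREFERENCES`, and `build_input` are hypothetical and do not come from the repository's code.

```python
from dataclasses import dataclass

# The three comparative preference labels from the task definition.
PREFERENCES = ("BETTER", "WORSE", "EQUAL")


@dataclass(frozen=True)
class Quadruple:
    """One comparative quadruple (sub, obj, cc, cp). Hypothetical helper type."""
    sub: str  # subject entity, corresponding to the product P
    obj: str  # object entity compared with sub
    cc: str   # comparative category, drawn from the predefined set C
    cp: str   # comparative preference: BETTER, WORSE, or EQUAL

    def __post_init__(self):
        if self.cp not in PREFERENCES:
            raise ValueError(f"unknown preference: {self.cp!r}")


def build_input(product: str, review: str) -> str:
    """Combine product P and review R into the single sentence S."""
    return f"This is a review of {product}: {review}"


if __name__ == "__main__":
    s = build_input("Apple iPhone 15", "a bit smoother than my 13")
    q = Quadruple("Apple iPhone 15", "Apple iPhone 13", "OS#PERFORMANCE", "BETTER")
    print(s)
    print(q)
```

Under this representation, the output for a sentence $S$ would be a (possibly empty) collection of `Quadruple` objects.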

Dataset

The SOCP Quadruple Extraction task aims to extract quadruples, each comprising a subject, an object, a category, and a preference. We built the SOCP-Phone dataset for this task. The dataset is collected from the JD platform and consists of mobile phone product reviews posted between November 1, 2021, and January 15, 2024. The statistics of SOCP-Phone are shown in the table below:

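| | | Train | Dev | Test | Total |
|---|---|---|---|---|---|
| #Categories | | 61 | 38 | 38 | 61 |
| #Sentences | #Comparative | 1680 | 210 | 210 | 2100 |
| | #Non-Comparative | 1680 | 210 | 210 | 2100 |
| | #Multi-Comparative | 559 | 79 | 69 | 707 |
| | Total | 3360 | 420 | 420 | 4200 |
| #Elements | Subject | 1680 | 210 | 210 | 2100 |
| | Object | 1857 | 230 | 225 | 2309 |
| | Category | 2475 | 323 | 309 | 3108 |
| | Preference | 1876 | 241 | 235 | 2352 |
| #Quadruples | BETTER | 1963 | 265 | 257 | 2485 |
| | WORSE | 400 | 24 | 37 | 482 |
| | EQUAL | 200 | 44 | 24 | 248 |
| | Total | 2563 | 333 | 318 | 3215 |
| #Quadruples/#Comparative | | 1.53 | 1.59 | 1.51 | 1.53 |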

\#Categories is the number of aspect categories. \#Sentences is the total number of annotated sentences, where \#Comparative, \#Non-Comparative, and \#Multi-Comparative are the numbers of comparative sentences, non-comparative sentences, and comparative sentences containing multiple comparisons, respectively. \#Elements is the total number of annotations for each comparative element (Subject, Object, Category, and Preference). \#Quadruples is the number of comparative quadruples formed by combining comparative elements, broken down by comparative preference (BETTER, WORSE, or EQUAL). \#Quadruples/\#Comparative is the average number of quadruples per comparative sentence.
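As a quick sanity check, the #Quadruples/#Comparative row can be recomputed from the quadruple totals and the comparative sentence counts reported in the table. The snippet below is a minimal sketch that hard-codes those reported numbers; the variable names are illustrative.

```python
# Reported counts from the statistics table (quadruple totals and
# comparative sentences per split).
quadruples = {"Train": 2563, "Dev": 333, "Test": 318, "Total": 3215}
comparative = {"Train": 1680, "Dev": 210, "Test": 210, "Total": 2100}

# Average number of quadruples per comparative sentence, to two decimals.
ratios = {
    split: round(quadruples[split] / comparative[split], 2)
    for split in quadruples
}

print(ratios)
# {'Train': 1.53, 'Dev': 1.59, 'Test': 1.51, 'Total': 1.53}
```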

Here is a sample data instance:

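{
    "_id": "0a171a04782299f20620dfe744aaabb9",
    "creationTime": "2023-10-19 17:16:21",
    "phoneName": "Apple iPhone 15",
    "phoneBrand": "Apple",
    "comment": "美滋滋，超便宜买到的才发布没多久的苹果15，体验了灵动岛，手感超级棒，比我的13流畅一些，可能是内存多的缘故，颜色很漂亮，很喜欢，感谢百亿补贴",
    "quadList": [
        {
            "subject": "Apple iPhone 15",
            "object": "AppleiPhone 13",
            "preference": "更好",
            "gold_category": "OS#PERFORMANCE"
        }
    ]
}

In English, the comment reads roughly: "Delighted. I got the recently released Apple 15 at a very low price and tried the Dynamic Island. It feels great in the hand and is a bit smoother than my 13, probably because it has more memory. The color is beautiful and I like it a lot. Thanks to the 10-billion subsidy." The preference value "更好" means "better", corresponding to BETTER.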

Fifty samples of SOCP-Phone are provided in "data/sample_50.json". The full dataset will be released after acceptance.
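A record shaped like the sample instance above can be converted into (subject, object, category, preference) tuples roughly as follows. This is a sketch under assumptions: the helper names `record_to_quadruples` and `load_records` are hypothetical, and `load_records` assumes the sample file is a JSON array of such records, which the README does not state.

```python
import json
from typing import Dict, List, Tuple

Quad = Tuple[str, str, str, str]  # (subject, object, category, preference)


def record_to_quadruples(record: Dict) -> List[Quad]:
    """Turn one record's "quadList" into a list of 4-tuples.

    A record without "quadList" yields an empty list.
    """
    return [
        (q["subject"], q["object"], q["gold_category"], q["preference"])
        for q in record.get("quadList", [])
    ]


def load_records(path: str) -> List[Dict]:
    """Load records from a file, ASSUMING it holds a JSON array of records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# The sample instance, trimmed to the fields used here.
sample_record = {
    "phoneName": "Apple iPhone 15",
    "quadList": [
        {
            "subject": "Apple iPhone 15",
            "object": "AppleiPhone 13",
            "preference": "更好",
            "gold_category": "OS#PERFORMANCE",
        }
    ],
}

if __name__ == "__main__":
    print(record_to_quadruples(sample_record))
```

The preference is kept as the raw annotated string, since the sample shows only one label value and the full label vocabulary is not listed here.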

Annotation

The annotation details are described in Annotation.md. We first introduce the construction of the category system, followed by the annotation schema with illustrative examples. Finally, we present the automatic annotation approach using large language models, along with the prompt design strategy.

