Abstract
Pattern mining, that is, the automated discovery of patterns from data, is a mathematically complex and computationally demanding problem that is generally not manageable by humans. In this article, we focus on small datasets and study whether it is possible to mine patterns with the help of the crowd by means of a set of controlled experiments on a common crowdsourcing platform. We specifically concentrate on mining model patterns from a dataset of real mashup models taken from Yahoo! Pipes and cover the entire pattern mining process, including pattern identification and quality assessment. The results of our experiments show that a sensible design of crowdsourcing tasks indeed may enable the crowd to identify patterns from small datasets (40 models). The results, however, also show that the design of tasks for the assessment of the quality of patterns to decide which patterns to retain for further processing and use is much harder (our experiments fail to elicit assessments from the crowd that are similar to those by an expert). The problem is relevant in general to model-driven development (e.g., UML, business processes, scientific workflows), in that reusable model patterns encode valuable modeling and domain knowledge, such as best practices, organizational conventions, or technical choices, that modelers can benefit from when designing their own models.
Original language | English |
---|---|
Article number | 17 |
Journal | ACM Transactions on Internet Technology |
Volume | 16 |
Issue number | 3 |
DOIs | https://doi.org/10.1145/2903138 |
Publication status | Published - 1 Jun 2016 |
Externally published | Yes |
Keywords
- Algorithms
- Experimentation
- Human factors
ASJC Scopus subject areas
- Computer Networks and Communications
Cite this
Mining and quality assessment of mashup model patterns with the crowd: A feasibility study. / Rodríguez, Carlos; Daniel, Florian; Casati, Fabio.
In: ACM Transactions on Internet Technology, Vol. 16, No. 3, 17, 01.06.2016.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Mining and quality assessment of mashup model patterns with the crowd
T2 - A feasibility study
AU - Rodríguez, Carlos
AU - Daniel, Florian
AU - Casati, Fabio
PY - 2016/6/1
Y1 - 2016/6/1
N2 - Pattern mining, that is, the automated discovery of patterns from data, is a mathematically complex and computationally demanding problem that is generally not manageable by humans. In this article, we focus on small datasets and study whether it is possible to mine patterns with the help of the crowd by means of a set of controlled experiments on a common crowdsourcing platform. We specifically concentrate on mining model patterns from a dataset of real mashup models taken from Yahoo! Pipes and cover the entire pattern mining process, including pattern identification and quality assessment. The results of our experiments show that a sensible design of crowdsourcing tasks indeed may enable the crowd to identify patterns from small datasets (40 models). The results, however, also show that the design of tasks for the assessment of the quality of patterns to decide which patterns to retain for further processing and use is much harder (our experiments fail to elicit assessments from the crowd that are similar to those by an expert). The problem is relevant in general to model-driven development (e.g., UML, business processes, scientific workflows), in that reusable model patterns encode valuable modeling and domain knowledge, such as best practices, organizational conventions, or technical choices, that modelers can benefit from when designing their own models.
KW - Algorithms
KW - Experimentation
KW - Human factors
UR - http://www.scopus.com/inward/record.url?scp=84978081289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978081289&partnerID=8YFLogxK
U2 - 10.1145/2903138
DO - 10.1145/2903138
M3 - Article
AN - SCOPUS:84978081289
VL - 16
JO - ACM Transactions on Internet Technology
JF - ACM Transactions on Internet Technology
SN - 1533-5399
IS - 3
M1 - 17
ER -