Authors:
Ismael Caballero (1); Laure Berti-Equille (2) and Mario Piattini (1)
Affiliations:
(1) University of Castilla-La Mancha, Spain; (2) Qatar Computing Research Institute, Qatar
Keyword(s):
Data Scientist, Maturity Model, Data Governance, Data Management, Data Quality Management.
Related Ontology Subjects/Areas/Topics:
Databases and Information Systems Integration; Enterprise Information Systems; Enterprise Resource Planning; Enterprise Software Technologies; Simulation and Modeling; Simulation Tools and Platforms; Software Engineering
Abstract:
With the unstoppable advance of Big Data, the role of the data scientist is becoming more important than ever before. In this position paper, we argue that scientists should be able to acknowledge the importance of data quality management in data science and rely on a principled methodology when performing tasks related to data management. In order to quantify how well a data scientist is able to perform the core data management activities, we propose the Personal Data Science Process (PdsP), which includes five staged qualifications for data science professionals. The qualifications are based on two dimensions: Personal Data Management Maturity (PDMM) and Personal Data Science Performance (PDSPf). The first one is defined according to Dgmr, a data management maturity model, which includes processes related to the areas of data management: data governance, data management, and data quality management. The second one, PDSPf, is grounded on PSP (Personal Software Process) and covers the personal skills and knowledge of a data scientist when participating in a data science project. These dimensions will allow the development of a measure of how well a data scientist can contribute to the success of the organization in terms of performance and skills appraisal.