First impressions of datapkg, how to proceed

Update 2011-30-09: I had not noticed the wonderful CKAN wiki, which explains several of the aspects I was looking for.

As I previously announced, I will write a GUI for the datapkg tool during my 10-week internship at FBK.
Today I am beginning to study its source code. It is very well written (a couple of hacks here and there, but it is highly object-oriented and makes heavy use of design patterns), yet it lacks internal documentation.
The datapkg command line interface makes use of the whole architecture, which, even with its great separation of concerns, does not seem to be designed as a library. Although it is a small program, it is quite complex.

For example, as far as I understand, a simple search for a dataset by name involves the creation of a Command object (from the command line) that creates a Specification object, which in turn creates an Index object. The chain may continue from there; I stopped looking. The code is well organised but difficult for an external developer to understand.

I am not criticizing the datapkg code negatively. Indeed, I am impressed by the quality of its organization and the beauty of its object-oriented design. Since it was not written to be used as a library, it lacks explanations of its architecture and design. An (internal) API is also missing.

Therefore, I think I will write a tiny, simple library to help me hide the complexity of the program. Before that, I strongly believe that some Software Engineering practices are needed to help me do my job. In the next few days I will do some reverse engineering and draw an initial architecture of datapkg. Then I will draw some diagrams to help me understand which components are called.
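To make the idea of a tiny wrapper library concrete, here is a minimal sketch of a facade in Python. It is only an illustration under loud assumptions: `SearchCommand`, `Specification`, `Index`, and `search_by_name` are hypothetical stand-ins I invented to mimic the Command, Specification, Index chain described above. They are not datapkg's real classes or signatures, and the in-memory dictionary replaces whatever a real index would query.

```python
"""Hypothetical facade sketch. None of these names are datapkg's real API."""
from dataclasses import dataclass, field


@dataclass
class Index:
    """Stand-in for an index: maps package names to titles (assumption)."""
    packages: dict = field(default_factory=dict)

    def search(self, query):
        # Case-insensitive substring match on the package name.
        needle = query.lower()
        return sorted(name for name in self.packages if needle in name.lower())


@dataclass
class Specification:
    """Stand-in for a specification that knows how to build its index."""
    index_type: str
    data: dict

    def make_index(self):
        # A real specification would pick an index implementation here;
        # this sketch always builds the in-memory one.
        return Index(packages=dict(self.data))


class SearchCommand:
    """Stand-in for the command object created from the command line."""

    def __init__(self, spec_string, data):
        self.spec = Specification(index_type=spec_string, data=data)

    def run(self, query):
        index = self.spec.make_index()
        return index.search(query)


def search_by_name(query, data, spec_string="memory://"):
    """Facade: one call that hides the command, specification, index chain."""
    return SearchCommand(spec_string, data).run(query)


if __name__ == "__main__":
    sample = {
        "gdp-world": "World GDP",
        "co2-italy": "CO2 emissions, Italy",
        "gdp-italy": "Italy GDP",
    }
    print(search_by_name("gdp", sample))
```

The point of the design is that a GUI would only ever call `search_by_name`; if the internal chain turns out to be longer or different once the reverse engineering is done, only the body of the facade has to change.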

Coding can wait.

About the author

dgraziotin

Dr. Daniel Graziotin is a senior researcher (Akademischer Rat) at the University of Stuttgart, Germany. His research interests include human, behavioral, and psychological aspects of empirical software engineering, studies of science, and open science. He is associate editor at the Journal of Open Research Software and academic editor at the Research Ideas and Outcomes (RIO) journal. Daniel was awarded an Alexander von Humboldt Fellowship for postdoctoral researchers in 2017, the European Design Award (bronze) in 2016, and the Data Journalism Award in 2015. He received his Ph.D. in computer science at the Free University of Bozen-Bolzano, Italy.
