Name: metrics
Owner: Libraries.io
Description: :chart_with_upwards_trend: What to measure, how to measure it.
Created: 2017-07-06 09:52:58.0
Updated: 2018-05-19 16:27:33.0
Pushed: 2018-03-29 09:39:34.0
Size: 56
Language: null
This repo aims to gather a diverse group of organisations, institutions and individuals to explore how best to measure, classify and otherwise infer information from software ecosystems and projects.
It is also a place to reference the significant works of others tackling these issues in industry, academia or as individuals.
For me this process begins with a number of framing questions. These questions are user-centred and based on the needs of our users, as defined in our personas. From there we can define measures and metrics.
Measures can be direct (a quoted metric, e.g. 'released on 1st Jan 2017'), derivative (information gleaned from data, e.g. 'released more than a year ago') or aggregated (compiled from more than one data point, e.g. 'all releases were within the last year').
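The three kinds of measure can be sketched in a few lines of Python. This is only an illustration: the release dates, the fixed reference date and the function names are all assumptions made up for the example, not part of Libraries.io.

```python
from datetime import date

# Hypothetical release dates for one project (illustrative data only).
releases = [date(2017, 1, 1), date(2017, 3, 15), date(2017, 6, 30)]
# Fixed reference date so the results do not change from day to day.
today = date(2017, 7, 6)


def latest_release(dates):
    """Direct measure: a metric quoted as-is."""
    return max(dates)


def released_more_than_a_year_ago(dates, as_of):
    """Derivative measure: inferred from a single metric."""
    return (as_of - max(dates)).days > 365


def all_releases_within_last_year(dates, as_of):
    """Aggregated measure: compiled from more than one data point."""
    return all((as_of - d).days <= 365 for d in dates)


print(latest_release(releases))
print(released_more_than_a_year_ago(releases, today))
print(all_releases_within_last_year(releases, today))
```

All three functions read the same underlying metric (release dates), which is what keeps the derivative and aggregated measures reproducible from source data.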
Measures may be broken down into a number of areas. At Libraries.io we have been considering areas for code, community, distribution and documentation. The single, overarching measure in Libraries.io is SourceRank which is defined over in our documentation.
Data can be 'harvested' (gathered automatically using APIs, data dumps, scraping etc.) or 'farmed' (gathered from a community by contribution).
Libraries.io currently focusses on harvesting metrics. It is preferable for a metric to be present in many sources rather than a single source so that we can make like-for-like comparisons across supported package managers.
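One way to check which metrics support like-for-like comparison is to intersect the metrics each source exposes. The sketch below assumes a simple mapping from source to metric names; the source names, metric names and the `comparable_metrics` helper are hypothetical and do not describe what any real package manager provides.

```python
# Hypothetical mapping of source -> metrics it exposes (illustrative only).
available = {
    "source_a": {"latest_release_date", "license", "dependents_count"},
    "source_b": {"latest_release_date", "license"},
    "source_c": {"latest_release_date", "dependents_count", "license"},
}


def comparable_metrics(metrics_by_source):
    """Return the metrics present in every source, sorted for stable output."""
    sets = list(metrics_by_source.values())
    if not sets:
        return []
    return sorted(set.intersection(*sets))


print(comparable_metrics(available))
```

Here `dependents_count` drops out because one source lacks it, so a measure built on it could not be compared across all three sources.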
Sources contain metrics. They may also contain measures themselves. We think it is important not to rely on proprietary, third-party services for measures.
Reproducibility is the key issue here. An inability to reproduce a measure from the source data (metrics) undermines like-for-like comparison of two pieces of software and ties all users of the classifier to that service. This is unacceptable(?).
While no single approach will be right for everyone, our hope is that we can come to a consensus about what good measures look like and what metrics they will require, so that Libraries.io can provide them.
Please feel free to fork and PR additions to any of the documents in this repo, to share your thoughts and ideas in an issue, or to propose draft specifications for areas, measures etc. as a PR or reference.