Draft of Open Data License for Consultation

VISWAM.AI, Swecha and SFLC.in have launched a draft open data license for public consultation. The license is designed to safeguard community-led datasets used by AI developers. The aim is to ensure that anyone using a dataset for machine learning or other purposes does not make it proprietary and contributes back to the community.

The need for a new licensing framework emerged as community initiatives for publishing datasets for public use have gained traction, including Swecha’s pioneering work in building Telugu speech-to-text datasets. In the absence of purpose-built licenses, such community-generated data risks being appropriated by large organisations without adequate attribution or reciprocal contribution.

The purpose of this new license is to establish a framework that requires any entity using a dataset for machine learning or related purposes to refrain from asserting proprietary rights over it and to contribute any modifications or derivatives back to the community. The Provider of the Original dataset must be appropriately credited, and any subsequent or amended dataset must be released under the same license. This framework is grounded in the principles of copyleft licenses used for software.

At its core, the initiative is about strengthening community data commons and supporting the creation of community-sourced datasets, while ensuring that proprietary models cannot extract value without attribution or giving back to the community.

The draft license for consultation can be accessed here.

We invite all stakeholders to use the link to submit public comments on the draft license. Alternatively, you can email your feedback to mail@sflc.in.