Open Source AI Definition – weekly update Feb 23
A weekly summary of interesting threads on the forum.
Is the definition of “AI system” by the OECD too broad?
Central question: Do we need to define what AI systems are?
- No, defining AI systems is not important to a definition of open source AI; it might even prove problematic in its application.
Training data access
Central question: for a model to be open source, do we need “open” access to its training data?
- Yes, to be able to have “freedom to modify”, we must know what data the model was trained with.
- No, training data and the trained model are two different assets, and we don’t necessarily need access to the original data to modify and specialise a model.
- BUT, are we asking the wrong question? Maybe the root issue is not strictly being able to copy the petabytes of original training data behind LLMs, but rather the lack of high-quality datasets available to train and fine-tune models. Is this a concern that OSI should address in a definition?
Recognising Open Source “Components” of an AI System
Central question: Should the definition of Open Source AI take a gradient approach (as is the case with the RAIL licence), judging the openness of the components of a model rather than the whole of it? How do we strike that balance without making the definition too restrictive?
- Yes, we must consider the openness of components to make sure that the definition will remain relevant and applicable.
- No, a definition should serve as a standard, one supported by different stakeholders. It should be practical in industry, academia and policymaking alike. Therefore, it must be an either/or approach.
It is worth highlighting that OSI intends to have a definition which is:
- binary, a “system” is either Open Source AI or is not; and
- applicable and useful. That’s why we’re seeking wide endorsement for the release candidate and the 1.0 version. This is something that is frequently mentioned in the town halls.
Also worth noting
- Results from Pythia and Llama2 working groups are out!
- Watch the recordings of the fourth town hall meeting on Defining Open Source AI and the accompanying slides.