Deep learning requires large amounts of storage. Machine learning datasets often reach hundreds of gigabytes, and pre-trained model weights can be large too. Many datasets (such as Google's JFT-300M) remain inaccessible to data scientists outside large organizations. Furthermore, open datasets are often scattered across many apps and websites, and require the user to follow a lengthy tutorial to download, set up and process them.
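To give a rough sense of why model weights can be large, here is a minimal back-of-the-envelope sketch. It assumes dense weights stored at a fixed number of bytes per parameter (4 bytes for float32) and ignores optimizer state, checkpoints and file-format overhead. The function name `weights_size_gb` and the parameter counts are hypothetical and chosen only for illustration.

```python
def weights_size_gb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Approximate size of dense model weights in gigabytes (1 GB = 10**9 bytes).

    Assumption: every parameter is stored at the same precision,
    e.g. 4 bytes for float32 or 2 bytes for float16.
    """
    return num_parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    # Hypothetical parameter counts, not taken from any specific model.
    for n in (100_000_000, 1_000_000_000):
        print(f"{n:,} parameters: about {weights_size_gb(n):.1f} GB in float32")
```

Halving the precision (for example `bytes_per_parameter=2`) halves the estimate, which is one reason lower-precision formats are popular for sharing weights.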
The mission of Algovera is to empower data scientists to work independently and keep ownership of what they create. We are passionate about creating AI apps that are oriented to the needs and preferences of the user rather than the corporation. For this reason, we are very excited to announce our partnership with DataUnion to improve the performance of face anonymization algorithms across different ages, ethnicities, and genders. In particular, this project will explore a novel approach to creating AI apps that are fairer, less biased, and collectively owned by users and creators, using the advantages of networks, crowdsourcing and Web3. You can find information on this new approach and why we think it’s important below. We are currently looking for freelance data scientists to work with us on this project, and you can submit your application using this form.
Recently, we announced the launch of Algovera Grants to fund projects that combine AI and Web3.
The mission of Algovera is to empower data scientists to work independently outside of centralised tech companies. We think this is preferable to the current status quo for two main reasons.
HuggingFace is an online community of data scientists with a mission of making it as easy as possible to train, optimize, and deploy models. HuggingFace Hub aims to provide a central place for collecting models, datasets and metrics. The model hub offers thousands of pretrained models for tasks across modalities such as text, vision, and audio. HuggingFace Spaces provides a simple way for data scientists and organisations to demonstrate machine learning apps. It provides access to cloud compute and accelerated deployment.
Many artists in Web3 use generative machine learning models to create digital art to sell as NFTs. This often involves re-training a model on a dataset collected by the artist, and publishing the output images on an NFT marketplace. Experimenting with newly collated datasets and training procedures is time-consuming and expensive, so artists are highly protective of these assets. However, recent technologies for private AI and decentralized marketplaces (e.g. Ocean Protocol) may enable artists to monetize their datasets and models while maintaining control and privacy. This has the added advantage of unlocking more value for artists and NFT enthusiasts through tokens, liquidity pools and staking.
We are very close to the community of AI startups in Ireland, having worked on our own startup developing machine learning (ML) algorithms for motion analysis in physiotherapy applications. During our time in the space, we have spoken to and built relationships with a large network of other startups developing ML and computer vision technology. Like our own previous venture, these startups follow Web2 practices. In a previous blog post, we explored the benefits that Web3 technologies can offer in the development of AI algorithms for our use case, along with our development efforts in this direction.