Wednesday, December 25, 2024

Datasaur allows you to automatically build a model based on a set of labels

Long before people started talking about ChatGPT and generative AI, companies like Datasaur were covering the basics of building machine learning models by helping to label the data used to train them. With the rise of AI, these capabilities have become even more critical.

To enable more companies to build models without a data scientist, Datasaur announced the ability to create a model directly from labeled data, putting model building within reach of a much less technical audience. It also announced a $4 million funding extension that closed last December.

Founder Ivan Lee says the recent surge in interest in artificial intelligence has been great for the company and actually fits well with the startup’s strategy. “Datasaur has always strived to be the best place to collect training data to feed into your models, whether it’s LLM models or traditional NER models or sentiment analysis or whatever,” Lee told TechCrunch.

“We are the best interface for non-technical users to come in and tag this data,” he said.

The rise of LLMs is helping to raise general awareness of how AI can help in a business context, but he says most companies are still in the exploratory stage and still need products like Datasaur to build models. Lee says one of his goals from the beginning has been to democratize AI, particularly in natural language processing, and the new model-building feature should make AI accessible to more companies, even those without specialized expertise.

“This feature is particularly exciting to me because it allows teams without data scientists and engineers to simply tag and label that data however they see fit, and it will just automatically train the model for them,” Lee said.

Lee sees this as a way to expand beyond the initial target market of data scientists. “Now we’re going to open it up to construction companies, law firms, marketing companies that may not have a background in data engineering but can still build NLP models [based on their training data],” he said.

He says he has been able to limit the amount of venture investment he takes (the previous round, in 2020, was a modest $3.9 million) because he runs the company frugally. His engineering team is mainly based in Indonesia, and although he expects to hire, he takes pride in running the company efficiently.

“My philosophy has always been profitability, growing in a scalable way, never growth at any cost,” Lee said. That means he weighs every hire and its impact on the company.

With a remote, cross-cultural workforce, employees can learn from each other, which inherently brings diversity to the company. “There is a significant difference in workplace culture between the US and the way things work in Indonesia. So one thing is that we had to consciously capture the best of both worlds,” he said. This may mean encouraging Indonesian colleagues to speak up or to disagree with a manager, something they may be reluctant to do for cultural reasons. “We were very active in encouraging this,” he said.

But he says American workers can also learn a lot from how things work in Asia, such as respect for colleagues and a culture of putting the team first, so he has had to help teams bridge these cultural differences.

The $4 million investment was led by Initialized Capital with participation from HNVR, Gold House Ventures and TenOneTen. In total, the company has raised $7.9 million.