Dippy SN11’s path to bring TAO to 10M+ users

2024-11-15

In my last blog post, we discussed how Dippy’s Subnet 11 on Bittensor aims to create an open-source alternative to Character AI’s closed-source roleplay LLM. Character AI is the world's most used AI product after ChatGPT/Gemini, handling 20,000 queries per second, roughly 20% of Google Search's volume. Unfortunately, Character's model is entirely closed source, and the company raised $100M+ to train it from scratch. Subnet 11 aims to create an open-source alternative to that closed roleplay model.

Subnet 11 was inspired by “Subnet 9 - Pre-Training”, a public good where miners compete to create world-class productivity LLMs. Similarly, on SN11, miners compete to create roleplay LLMs.

You can chat with this cute cat therapist named Guru in the Dippy App!

We will initially showcase the power of SN11’s open-source model by having it power the Dippy app, which currently serves 280K+ monthly active users. Following this, we hope the Bittensor model gains widespread adoption amongst indie devs building AI companion apps. The AI companion category is the fastest-growing product segment in consumer AI and is expected to generate $70-$150 billion in annualized revenue by 2030.

SN11 hopes to do for roleplay LLMs what Mistral did for productivity LLMs and Stability AI did for image generation.

How will the Dippy app bring the power of $TAO to millions? Does your app even have that much usage? 

Yes! Dippy has a rapidly growing userbase of 280K+ monthly active users across iOS and Android.

Screenshot from our analytics dashboard on Sept 3rd, 2024!

Dippy users are all over the world!

In July, Dippy ranked #1 in Top New Free Apps on Google’s Play Store in Germany and #2 in Italy. Dippy’s landing page alone got 200K+ page views in July (Source: SimilarWeb). We expect usage of Dippy to increase rapidly when we launch our web app in October.

Dippy topped the Play Store charts in Germany and Italy recently! (Source: Data.ai)

That’s cool! How are the models produced by your subnet currently performing? 

The miner-submitted models produced by our subnet are 7-8B parameters in size and now perform better than Llama 3 8B and GPT-3.5 Turbo on EQBench! EQBench is a benchmark that measures the emotional intelligence of LLMs, which is critical for roleplay.

Our tiny 7B miner models perform better than the massive GPT-3.5 Turbo on EQBench!

We recently achieved a breakthrough in model quality after we started using a continuously generated synthetic dataset to evaluate miner-submitted models. This dataset innovation allowed us to overcome the overfitting issue plaguing many of the model-creation subnets in Bittensor.

The synthetic dataset has 100K+ conversations and is constantly growing. SN11 started off evaluating models against a dataset called PIPPA (built from conversations with Character AI, the industry-leading roleplay product). However, we quickly learned that miners were overfitting to it, so we created our own synthetic dataset using several larger SOTA open-source roleplay models, like Magnum 72B and Euryale 70B, alongside closed-source SOTA models from OpenAI and Anthropic (Claude).
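
The exact scoring pipeline lives in SN11’s validator code, but here is a rough sketch of the general idea, evaluating a candidate model against held-out synthetic conversations. The model ID and sample conversation below are placeholders, not real submissions:

```python
# Sketch only: SN11's real validator logic differs. This just illustrates the
# idea of scoring a candidate model against held-out synthetic roleplay chats.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "some-miner/roleplay-7b"  # placeholder, not a real submission

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
model.eval()

def conversation_loss(turns: list[str]) -> float:
    """Average cross-entropy of the model over one synthetic conversation."""
    text = "\n".join(turns)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return out.loss.item()

# synthetic_conversations would come from the continuously growing dataset.
synthetic_conversations = [
    ["User: I had a rough day at work.",
     "Character: I'm here for you. Want to talk about it?"],
]
scores = [conversation_loss(c) for c in synthetic_conversations]
print("mean loss:", sum(scores) / len(scores))  # lower loss = better fit to the data
```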

Interestingly, SN11 doesn’t actually use EQBench as a direct scoring metric to assign weights! Yet, as the graph above shows, performing well on SN11’s roleplay benchmark is indirectly correlated with performing well on EQBench. SN11 has had to create its own benchmark for evaluating roleplay LLMs, because this is a brand-new research field with almost no peer-reviewed benchmarks.

All of this is cool, but can I try these miner submitted models you keep talking about? How about downloading and using them myself? 

Yes, you can! We recently released our front-end, which we will constantly update with the best miner-submitted models: https://bittensor.dippy.ai/play

Try out the miner submitted models yourself, right here!

These miner models are currently just 7B, but we expect them to get significantly better as we encourage submissions of larger models. We also have a leaderboard, so you can find and download the top models: https://huggingface.co/spaces/DippyAI/SN11-DippyRoleplay
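
If you’d rather run one of the leaderboard models locally, standard Hugging Face tooling is enough. A quick sketch follows; the repo ID is a placeholder, so substitute whichever model currently tops the leaderboard:

```python
# Sketch: load a leaderboard model locally with standard Hugging Face tooling.
# The repo ID below is a placeholder; substitute a real entry from the leaderboard.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "DippyAI/some-top-miner-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "You are Guru, a gentle cat therapist.\nUser: I can't sleep lately.\nGuru:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```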

How do your Bittensor models get better? When will millions of Dippy users start using it? 

Our current evaluation criteria have clearly proven to select for strong roleplay ability, as evidenced by the miner-submitted models outperforming much bigger models like GPT-3.5 on EQBench.

In early November, we plan to add the top-performing miner-submitted 7-13B model to the Dippy app. This will let us benchmark user preference against the in-house 40B+ parameter roleplay model that currently powers the app.

Following this, we will allow miners to fine-tune or train these state-of-the-art 7B-13B models on a wide variety of datasets, each of which emphasizes a particular dimension. For example, some models may excel at therapy, while others are strong at comedy, and so on. We hypothesize that these domain-specific fine-tuned models will give us a solid foundation to scale roleplay capabilities to the next level.

Then, the Dippy team will open-source a model router that decides, based on the user's input, which of these specialized models should respond. This is very similar to the Mixture of Experts architecture, wherein the “expert” that is strongest along one dimension is chosen to produce the output. The routing model will send queries to the top miner-submitted model in each dimension relevant for roleplay (therapy, comedy, romance, and so on) and provide a world-class experience for users.
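
To make the idea concrete, here is a minimal routing sketch. This is not Dippy’s actual router; the domain labels, model IDs, and the zero-shot classifier used here are illustrative assumptions:

```python
# Minimal routing sketch (not Dippy's actual router): classify the user's
# message into a roleplay dimension, then dispatch to that dimension's model.
from transformers import pipeline

# Hypothetical mapping from dimension -> best miner-submitted model for it.
DOMAIN_MODELS = {
    "therapy": "miner-a/roleplay-therapy-13b",
    "comedy": "miner-b/roleplay-comedy-7b",
    "romance": "miner-c/roleplay-romance-13b",
}

# One simple routing signal: zero-shot classification over the domain labels.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def route(user_message: str) -> str:
    """Return the model ID best suited to handle this message."""
    result = classifier(user_message, candidate_labels=list(DOMAIN_MODELS))
    best_domain = result["labels"][0]  # labels come back sorted by score
    return DOMAIN_MODELS[best_domain]

print(route("I've been feeling really anxious about work lately."))
# -> most likely routes to the therapy-specialized model
```

In production, the routing decision would be refined against the real user-behavior signals described next, rather than a generic off-the-shelf classifier.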

At this point, SN11 will run a new competition for miners, where they will build on top of the open-source model router we built internally. They’ll also iterate on the routing model, optimizing it against real user data such as mean conversation length, number of message retries, number of messages edited, liked messages, and much more.


We think this is a clear path towards building a SOTA roleplay model through Bittensor while serving millions, if not billions, of people across the globe! At this point, we hope the open-sourced SN11 Bittensor model gains widespread adoption amongst indie devs building AI companion apps, the fastest-growing product segment in consumer AI, expected to generate $70-$150 billion in annualized revenue by 2030.

Ok, I’m sold! Bittensor and Dippy Subnet 11 sounds cool! How do I get involved? 

A great place to start would be the Bittensor Dev Docs and the Bittensor Discord server. Each channel in the Discord server is operated by a subnet owner, and in the “pinned” section of each subnet channel you will find detailed info on how to get started as a miner. If you want to work on building great roleplay models, check out the Subnet 11 channel!

TaoStats is the best website for visualizing the whole Bittensor ecosystem. Bittensor offers a great opportunity to contribute to the open-source AI movement, and my hope with this blog is to demystify the protocol for many!