Submitted by zalivom1s t3_11da7sq in singularity
OutOfBananaException t1_jadwqrn wrote
Reply to comment by Olivebuddiesforlife in Leaked: $466B conglomerate Tencent has a team building a ChatGPT rival platform by zalivom1s
What kind of data do you mean? I don't believe they have a high quantity of quality domestic text training data, and they have stated they don't want to use worldwide data. It's not clear how they plan to resolve this.
Olivebuddiesforlife t1_jaf19dw wrote
First, the Chinese sample set is 1.4B people, and they have been training their AI at the enterprise level with cameras, image recognition, and processing. There are huge farms of people, entire industries, that have served as the AI models' human partners since 2017.
Second, the language model can work with the WeChat data, which contains a great deal of person-to-person interaction, whereas Western data does not include that and covers only general public interactions. Even considering private data, everything being consolidated on a single platform means a lot.
Third, TikTok data: it is one of the largest social media platforms, with large data sets covering language, culture, and so on.
So I guess this adds the quality. And they don't want to expand to the West, which makes the decision understandable.
There have been low-level chatbots in China, and so far they've focused on enterprise and public (read: government) use. They're venturing into private use, I guess.