ShrimpsIsBugs
So English is tokenized most efficiently because the tokenizer was mostly trained on English text, so tokenizing words of this language efficiently was most important - do I understand that correctly?
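Roughly, yes - BPE-style tokenizers learn their merge table from pair frequencies in the training data, so whichever language dominates the corpus ends up with the most compact encoding. Here's a toy byte-pair-encoding sketch (a deliberate simplification, not any real tokenizer's implementation) that shows the effect: a merge table trained on English-only text compresses a frequent English word into few tokens, while an unseen German word stays at the character level.

```python
from collections import Counter

def merge_word(word, pair):
    """Replace every adjacent occurrence of `pair` in `word` with the merged symbol."""
    out, i = [], 0
    while i < len(word):
        if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return tuple(out)

def train_bpe(corpus, num_merges):
    """Learn a merge table from the corpus: repeatedly merge the most frequent pair."""
    vocab = Counter()
    for w in corpus.split():
        vocab[tuple(w) + ("</w>",)] += 1  # words as char tuples + end marker
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for word, freq in vocab.items():
            for i in range(len(word) - 1):
                pair_counts[(word[i], word[i + 1])] += freq
        if not pair_counts:
            break
        best = pair_counts.most_common(1)[0][0]
        merges.append(best)
        new_vocab = Counter()
        for w, f in vocab.items():
            new_vocab[merge_word(w, best)] += f
        vocab = new_vocab
    return merges

def tokenize(word, merges):
    """Apply the learned merges, in training order, to a new word."""
    toks = tuple(word) + ("</w>",)
    for pair in merges:
        toks = merge_word(toks, pair)
    return toks

# Train on an English-only toy corpus: frequent English words get merged
# into single tokens, while unseen (e.g. German) words stay character-level.
merges = train_bpe("the the the the cat cat dog dog", num_merges=10)
print(tokenize("the", merges))    # few tokens: its pairs dominated training
print(tokenize("warum", merges))  # many tokens: no learned merge applies
```

The same mechanism at production scale (with byte-level vocabularies and billions of words) is why English text typically needs fewer tokens per word than other languages under an English-heavy tokenizer.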
I also think we'll have ads at some point - and that's perfectly fine and understandable, as long as there aren't too many and they aren't too intrusive. My hope is that, because of Lemmy's federated nature, healthy competition will emerge. So whenever an instance starts overloading its users with ads, users will just move to another instance with fewer ads in the blink of an eye.