Submitted by sonudofsilence t3_y19m36 in deeplearning

I would like to extract the word embeddings of a text and visualize them in the same plot (to understand them better). The question is: how should I pass the text into the pretrained BERT model? At first, I split the text into sentences and passed each one separately, but I'm not sure whether this gave the right results.

2

Comments


neuralbeans t1_irw2mza wrote

If you're talking about the contextual embeddings that BERT is known for, then those change depending on the sentence the word appears in, so you need to supply the full sentence.

2

sonudofsilence OP t1_irw4bmr wrote

Yes, that's why I want to pass "all the text" into BERT: a word in one sentence should get a vector similar to that of the same word (with the same meaning) in another sentence. How can I accomplish that, given that BERT's maximum input length is 512 tokens?

1

neuralbeans t1_irw4jiw wrote

You're supposed to pass in each sentence separately, as a list of sentences. You do not pass all the sentences as one string.

1
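
The advice in this thread (pass a list of sentences, not one long string) could look roughly like the sketch below. It assumes the Hugging Face `transformers` library with PyTorch and the `bert-base-uncased` checkpoint, none of which are named in the thread; `embed_sentences` is a hypothetical helper name. Each sentence is tokenized and truncated independently, so the 512-token limit applies per sentence rather than to the whole text.

```python
import torch
from transformers import AutoModel, AutoTokenizer


def embed_sentences(sentences, tokenizer, model):
    """Return, for each sentence, a list of (token, contextual vector) pairs."""
    # Tokenize the whole list at once. Padding aligns lengths within the batch;
    # truncation caps each sentence (not the whole text) at 512 tokens.
    batch = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )
    model.eval()
    with torch.no_grad():
        # Shape: (number of sentences, padded sequence length, hidden size)
        hidden = model(**batch).last_hidden_state

    results = []
    for ids, mask, vectors in zip(batch["input_ids"], batch["attention_mask"], hidden):
        keep = mask.bool()  # drop padding positions
        # Tokens are WordPiece units and include the special [CLS] and [SEP] tokens.
        tokens = tokenizer.convert_ids_to_tokens(ids[keep].tolist())
        results.append(list(zip(tokens, vectors[keep])))
    return results


if __name__ == "__main__":
    name = "bert-base-uncased"  # assumed checkpoint, not specified in the thread
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    sentences = ["She sat by the river bank.", "He opened a bank account."]
    for sentence in embed_sentences(sentences, tokenizer, model):
        for token, vector in sentence:
            print(token, tuple(vector.shape))
```

The vectors collected across all sentences can then be stacked and projected to two dimensions (for example with PCA or t-SNE) to draw them in a single plot. Note that the same word in different sentences will get nearby but not identical vectors, since each vector depends on its sentence context.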