About

I do research on very long sequence processing with neural networks. By “very long” I mean tens and potentially hundreds of thousands of tokens, which would let sequence processing models work with chapters or even whole articles, in contrast to the current state-of-the-art limit of a few paragraphs. My other goals include smaller neural networks that reach state-of-the-art performance with less compute, and a theoretical explanation of sequence processing models.