Submitted by AttentionImaginary54 t3_102mf6v in MachineLearning
marcus_hk t1_j2xqe50 wrote
>Are there other recent deep learning based alternatives?
Transformers seem best suited to forming associations among discrete elements; that's what self-attention does, after all. Where transformers perform well over very long ranges (in audio generation, for example), there is typically heavy use of Fourier transforms and CNNs as "feature extractors", and the transformer does not process the raw data directly.
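A rough sketch of that pattern, not any specific published model: a small strided 1-D CNN turns raw waveform samples into coarse feature frames, and the transformer only attends over those discrete frames, never the raw samples. All layer sizes here are made-up for illustration.

```python
import torch
import torch.nn as nn

class ConvFrontendTransformer(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        # Strided convolutions downsample the raw waveform into feature frames.
        self.frontend = nn.Sequential(
            nn.Conv1d(1, d_model, kernel_size=10, stride=5), nn.GELU(),
            nn.Conv1d(d_model, d_model, kernel_size=8, stride=4), nn.GELU(),
        )
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, wav):                        # wav: (batch, samples)
        feats = self.frontend(wav.unsqueeze(1))    # (batch, d_model, frames)
        feats = feats.transpose(1, 2)              # (batch, frames, d_model)
        return self.encoder(feats)                 # attention over frames only

model = ConvFrontendTransformer()
out = model(torch.randn(2, 16000))   # ~1 s of 16 kHz audio
print(out.shape)                     # far fewer frames than raw samples
```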
The S4 model linked above treats time-series data not as discrete samples but as a continuous signal. Consequently, it works much better.
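To make the "continuous signal" point concrete, here is a minimal sketch of the idea underlying S4 (not the actual S4 implementation, which adds HiPPO initialization, structured state matrices, and a convolutional view): a continuous-time state space model x'(t) = Ax(t) + Bu(t), y(t) = Cx(t), whose parameters are defined in continuous time and only discretized (bilinear transform) for whatever sampling step dt the data happens to have.

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of a continuous-time SSM."""
    n = A.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv(I - (dt / 2) * A)
    A_bar = inv @ (I + (dt / 2) * A)
    B_bar = inv @ (dt * B)
    return A_bar, B_bar

def run_ssm(A, B, C, u, dt):
    """Step the discretized SSM over a 1-D input sequence u."""
    A_bar, B_bar = discretize(A, B, dt)
    x = np.zeros((A.shape[0], 1))
    ys = []
    for u_t in u:
        x = A_bar @ x + B_bar * u_t
        ys.append((C @ x).item())
    return np.array(ys)

# Toy example: random (roughly stable) state matrix, scalar input and output.
rng = np.random.default_rng(0)
n = 4
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
u = np.sin(np.linspace(0, 4 * np.pi, 200))

y = run_ssm(A, B, C, u, dt=0.05)
print(y[:5])
```

Because A, B, C live in continuous time, the same trained parameters can in principle be re-discretized at a different dt, which is part of why this family handles long, raw signals more gracefully than attention over individual samples.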