Submitted by super_deap t3_11tmpc5 in MachineLearning
mike94025 t1_jcn7ksu wrote
Reply to comment by royalemate357 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
Works for all. You need a compiler backend that can generate code for your target, and a frontend for the optimizer that can process the IR.
Alternatively, you need a backend for Triton (or another already-supported optimizer) that can generate code for your target architecture.
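[For context, the kernels being discussed all compute scaled dot-product attention, softmax(QK^T / sqrt(d)) V; flash-attention backends produce the same result in a fused, tiled kernel. A minimal NumPy sketch of that math, for reference — `sdpa` is an illustrative name here, not the PyTorch API (that is `torch.nn.functional.scaled_dot_product_attention`):]

```python
import numpy as np

def sdpa(q, k, v):
    """Reference scaled dot-product attention: softmax(QK^T / sqrt(d)) @ V.
    A fused flash-attention kernel computes the same output without
    materializing the full (seq, seq) score matrix."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Tiny example: batch of 2, sequence length 4, head dim 8
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((2, 4, 8)) for _ in range(3))
out = sdpa(q, k, v)
print(out.shape)  # (2, 4, 8)
```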
royalemate357 t1_jcnjaeo wrote
Oh cool, thanks for the clarification. Nice that you folks made it more backend-independent. It would be interesting to try it out on AMD/MPS devices; I wonder if those requirements are met on those devices, though.
mike94025 t1_jcv7ltl wrote
You might look into https://github.com/pytorch/pytorch/pull/95793.