Action Chunking with Transformers (ACT) is a learning-based robotics approach that models sequences of robot actions with a transformer architecture. Instead of predicting one low-level command per timestep, an ACT policy predicts a chunk of several consecutive actions at once, which supports more stable control over longer horizons. The technique is particularly useful for manipulation and other sequential tasks where temporal coherence matters. ACT bridges perception, planning, and control by learning a policy from demonstrations or collected trajectories. Although it borrows the transformer architecture popularized in language processing, ACT operates on robot action spaces rather than text. Because the policy is queried once per chunk rather than once per step, small prediction errors have fewer opportunities to compound over a trajectory. ACT represents a shift toward more structured and scalable robot control policies.
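The chunked control loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ACT implementation: `toy_policy` and `toy_env_step` are hypothetical stand-ins for a trained transformer policy and a real robot or simulator, and the state is a single number so the bookkeeping stays visible. The point it shows is only that the policy is queried once per chunk instead of once per timestep.

```python
from typing import Callable, List, Tuple

Observation = float
Action = float


def toy_policy(obs: Observation, chunk_size: int) -> List[Action]:
    """Hypothetical stand-in for a learned policy (assumption, not ACT itself).

    Returns a chunk of `chunk_size` equal actions that together move a
    1-D state toward zero. A real ACT policy would be a trained transformer.
    """
    return [-obs / chunk_size] * chunk_size


def toy_env_step(obs: Observation, action: Action) -> Observation:
    """Hypothetical 1-D environment: the action is added to the state."""
    return obs + action


def run_chunked(
    policy: Callable[[Observation, int], List[Action]],
    env_step: Callable[[Observation, Action], Observation],
    obs: Observation,
    horizon: int,
    chunk_size: int,
) -> Tuple[Observation, int]:
    """Run `horizon` control steps, querying the policy once per chunk.

    Returns the final observation and the number of policy queries.
    """
    queries = 0
    t = 0
    while t < horizon:
        chunk = policy(obs, chunk_size)  # one prediction covers several steps
        queries += 1
        # Execute the chunk open-loop, truncating it at the horizon.
        for action in chunk[: horizon - t]:
            obs = env_step(obs, action)
            t += 1
    return obs, queries


if __name__ == "__main__":
    _, per_step_queries = run_chunked(toy_policy, toy_env_step, 1.0, 20, 1)
    _, chunked_queries = run_chunked(toy_policy, toy_env_step, 1.0, 20, 5)
    print(per_step_queries, chunked_queries)
```

With a horizon of 20 steps, a chunk size of 1 queries the policy 20 times, while a chunk size of 5 queries it 4 times. Fewer queries mean fewer points at which a prediction error can feed back into the next prediction, at the cost of reacting more slowly to new observations within a chunk.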