This thesis proposes new architectures for deep neural networks that use attention enhancement and multilinear algebra methods to improve their performance. We also explore graph convolutions and their particularities. We focus on problems related to real-time pose estimation. Pose estimation is a challenging problem in computer vision with many real-world applications in areas including augmented reality, virtual reality, computer animation, and 3D scene reconstruction. Usually, the problem to be addressed involves estimating the 2D and 3D human pose, i.e., the anatomical keypoints or body “parts” of persons in images or videos. Several papers propose approaches that achieve high accuracy using architectures based on conventional convolutional neural networks; however, mistakes caused by occlusion and motion blur are not uncommon, and those models are computationally very intensive for real-time applications. We explore different architectures to improve processing time and, as a result, propose two novel neural network models for 2D and 3D pose estimation. We also introduce a new architecture for graph attention networks called Semantic Graph Attention.