The paper presents the theory and numerics of a thermodynamically consistent formulation of gradient plasticity at small strains. Starting from the classical local continuum formulation, which fails to produce physically meaningful and numerically convergent results in localization computations, a thermodynamically motivated gradient plasticity formulation is proposed. The model rests on three ingredients: an assumption for the Helmholtz free energy that incorporates the gradient of the internal history variable, a yield condition, and the postulate of maximum dissipation, which results in an associated structure. As a result, the driving force conjugate to the hardening evolution is identified as the quasi-non-local drag stress, which comprises, in addition to the strictly local drag stress, the divergence of a vectorial hardening flux. On the numerical side, the algorithmic consistency condition has to be solved in weak form alongside the balance of linear momentum. The crucial issue here is determining the active constraints, i.e. those exhibiting plastic loading; this is handled by an active set search algorithm borrowed from convex non-linear programming. Moreover, different discretization techniques are proposed in order to compare the finite element performance of local plasticity with that of the advocated gradient formulation, for both hardening and softening. Copyright © 2001 John Wiley & Sons, Ltd.
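
The active set search mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes, purely for illustration, that the discretized consistency condition takes the linear complementarity form: find plastic multiplier increments `dl >= 0` such that `f = f_trial - K dl <= 0` and `dl[i] * f[i] = 0` at every point. The function names `solve_dense` and `active_set_search`, the matrix `k`, and the vector `f_trial` are all hypothetical; the off-diagonal entries of `k` stand in for the coupling introduced by the gradient term.

```python
def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def active_set_search(k, f_trial, tol=1e-12, max_iter=50):
    """Illustrative active set iteration (assumed form, not the paper's).

    Seeks dl >= 0 with f = f_trial - k @ dl <= 0 and dl[i] * f[i] = 0.
    """
    n = len(f_trial)
    # Initial guess: points whose trial yield function is violated.
    active = [i for i in range(n) if f_trial[i] > tol]
    for _ in range(max_iter):
        dl = [0.0] * n
        if active:
            # Enforce f = 0 on the current active set only.
            k_aa = [[k[i][j] for j in active] for i in active]
            sol = solve_dense(k_aa, [f_trial[i] for i in active])
            for i, value in zip(active, sol):
                dl[i] = value
        f = [f_trial[i] - sum(k[i][j] * dl[j] for j in range(n))
             for i in range(n)]
        # Drop points with non-positive multiplier (unloading);
        # add inactive points that now violate the yield condition.
        new_active = [i for i in range(n)
                      if (i in active and dl[i] > 0.0)
                      or (i not in active and f[i] > tol)]
        if new_active == active:
            return dl, f, active
        active = new_active
    raise RuntimeError("active set search did not converge")


# Hypothetical 3-point example with a tridiagonal coupling matrix.
k = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
dl, f, active = active_set_search(k, [1.0, -0.2, 0.5])
```

In this example the middle point is elastic in the trial state, yet it joins the active set after the first pass because of the coupling to its neighbours. This is the kind of situation in which a pointwise trial-state check, as used in local plasticity, is no longer sufficient and a search over the active set becomes necessary.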