Recently the ForkAES construction was proposed by Andreeva et al. for efficiently performing authenticated encryption of very short messages on next generation IoT devices. The ForkAES tweakable block cipher uses around one and a half AES encryption calls to produce a pair of ciphertexts for any given plaintext. However the only downside of the construction is that it needs to store an extra state of 128 bits in addition with the storage elements required to perform AES encryption. Thus a hardware implementation of ForkAES would require additional circuit area to accommodate the extra state. In this paper, we first show that it is possible to implement ForkAES without any additional storage elements other than those required to implement AES, if the AES circuit can additionally perform decryption. Such an implementation naturally requires more clock cycles to perform ForkAES operations. We extend the recently proposed Atomic AES v2.0 architecture to realize ForkAES and compare the area-latency trade-offs incurred with and without an additional storage. The area of the most compact ForkAES design takes about 1.2 times that of AES. In the second part of the paper we look at another important parameter of lightweight efficiency, i.e. energy. It is well known that round based constructions for AES are the most energy efficient ones. We extend the so-called S3K2 construction of Banik et al. (IEEE HOST 17) to realize ForkAES in an energy-preserving manner, and compare the effects of some design choices. The energy consumption of our best ForkAES design takes about 2 times that of AES. From lightweight design perspective, our results hence demonstrate that although ForkAES lives up to its promise (of being roughly 1.5 times that of AES) in terms of its area, the same does not hold for its energy consumption.