Reverse Engineering UEFI Firmware
by Jethro Beekman – Mar 8, 2015. Filed under: English, Technology, UEFI, Reverse Engineering.
In order to figure out how my BIOS drive password worked, I had to reverse-engineer the firmware that comes with my laptop. You can find the binary blobs on the update CD that Lenovo provides, and it turns out these blobs are actually UEFI images. UEFI firmware is made up of many different loadable modules (drivers, shared libraries, etc.), which are stored in the Portable Executable (PE) image format. These modules can be extracted from the image using Nikolaj Schlej’s excellent UEFIExtract (from UEFITool). Once you have all the PE modules, the real reversing can begin.
It helps to understand how UEFI works. The Internet contains a wealth of information, and here are two articles to get you started: Getting started with UEFI development and UEFI Programming - First Steps. The main problem that makes reverse engineering hard is that while the firmware consists of over 300 loadable modules, there is no dynamic linker. Instead, the entry point of a module gets passed a pointer to a “protocol” registry. A protocol is basically an interface, or in other words a struct of function pointers. The registry is keyed by globally unique identifiers (GUIDs). To call into another module, you need to look up a GUID in the registry and then call some function returned in the interface.
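To make the registry idea concrete, here is a minimal sketch in plain C. It is not UEFI code: `guid_t`, `registry`, `registry_install`, `registry_locate` and the `adder_protocol` interface are all hypothetical names invented for illustration, and the real boot services have more parameters and proper status codes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for EFI_GUID: 16 opaque bytes. */
typedef struct { uint8_t bytes[16]; } guid_t;

/* A "protocol" is a struct of function pointers (hypothetical example). */
typedef struct adder_protocol {
    int (*add)(struct adder_protocol *self, int a, int b);
} adder_protocol;

#define MAX_PROTOCOLS 16

typedef struct {
    guid_t guid;
    void  *interface;
} registry_entry;

typedef struct {
    registry_entry entries[MAX_PROTOCOLS];
    size_t count;
} registry;

/* Register an interface under a GUID. Returns 0 on success, -1 if full. */
int registry_install(registry *r, const guid_t *guid, void *interface) {
    if (r->count == MAX_PROTOCOLS) return -1;
    r->entries[r->count].guid = *guid;
    r->entries[r->count].interface = interface;
    r->count++;
    return 0;
}

/* Look up an interface by GUID. Returns NULL if nothing is registered. */
void *registry_locate(const registry *r, const guid_t *guid) {
    for (size_t i = 0; i < r->count; i++)
        if (memcmp(r->entries[i].guid.bytes, guid->bytes, 16) == 0)
            return r->entries[i].interface;
    return NULL;
}

/* One module's implementation of the hypothetical interface. */
int adder_add(adder_protocol *self, int a, int b) {
    (void)self;
    return a + b;
}
```

A caller only ever sees the GUID and the layout of the struct, which is why a disassembler shows an indirect call through a pointer rather than a named import.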
My first strategy to get some insight into the firmware was to collect GUIDs from images and build a dependency graph. This turned out to be useless. The UEFI image contains PEI dependency sections for each image, but the GUIDs that are listed seem to have no relation to actually required protocols. Furthermore, identifying GUIDs (also known as 16 random bytes) in binaries is hard, and even when I managed to identify a section that seemed to store GUIDs, there would be many GUIDs in such a section that were never referenced from code in that image.
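Searching for a GUID you already know is much easier than discovering unknown ones. The sketch below converts the textual form into the in-memory EFI_GUID layout (the first three fields are little-endian, the last eight bytes are stored as written) and does a brute-force byte search. The function names `guid_text_to_bytes` and `find_guid` are made up for this example and are not taken from any of the tools mentioned in this post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into the 16-byte layout of an
 * EFI_GUID. Returns 0 on success, -1 on malformed input. */
int guid_text_to_bytes(const char *text, uint8_t out[16]) {
    /* Destination index for each byte, taken in textual order. */
    static const int order[16] = {3, 2, 1, 0, 5, 4, 7, 6,
                                  8, 9, 10, 11, 12, 13, 14, 15};
    size_t pos = 0;
    if (strlen(text) != 36) return -1;
    for (int i = 0; i < 16; i++) {
        if (text[pos] == '-') pos++;            /* skip a field separator */
        int hi = hex_val(text[pos]);
        if (hi < 0) return -1;
        int lo = hex_val(text[pos + 1]);
        if (lo < 0) return -1;
        out[order[i]] = (uint8_t)(hi * 16 + lo);
        pos += 2;
    }
    return 0;
}

/* Return the offset of the first occurrence of guid in buf, or -1. */
long find_guid(const uint8_t *buf, size_t len, const uint8_t guid[16]) {
    for (size_t i = 0; i + 16 <= len; i++)
        if (memcmp(buf + i, guid, 16) == 0) return (long)i;
    return -1;
}
```

The search makes no alignment assumptions, so it is slow but will not miss a GUID embedded at an odd offset.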
To figure out the dependencies, I decided to actually run the modules and see which protocols they look up and which ones they register. Wait what, run UEFI PE modules? Yes, I wrote a tool called efiperun that can load PE modules into memory and simulate enough of what a UEFI environment is supposed to look like to actually run them. Most modules will, upon entry, look up some standard protocols, do some initialization, and register one or more protocols that other modules can use.
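The tracing idea can be sketched without any PE loading at all: hand the entry point a table of stub services that do nothing except record which GUIDs were requested or registered. This is only an illustration of the concept and not how efiperun is actually implemented; `fake_boot_services`, `run_traced` and `example_module` are hypothetical names, and the real boot services table has many more entries.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t bytes[16]; } guid_t;

typedef enum { TRACE_LOCATE, TRACE_INSTALL } trace_kind;
typedef struct { trace_kind kind; guid_t guid; } trace_event;

#define MAX_EVENTS 64
static trace_event trace_log[MAX_EVENTS];
static size_t trace_len;

static void record_call(trace_kind kind, const guid_t *guid) {
    if (trace_len < MAX_EVENTS) {
        trace_log[trace_len].kind = kind;
        trace_log[trace_len].guid = *guid;
        trace_len++;
    }
}

/* Hypothetical, heavily reduced stand-in for the boot services table. */
typedef struct {
    int (*locate_protocol)(const guid_t *guid, void **interface);
    int (*install_protocol)(const guid_t *guid, void *interface);
} fake_boot_services;

static int fake_locate(const guid_t *guid, void **interface) {
    record_call(TRACE_LOCATE, guid);
    *interface = NULL;
    return -1;                  /* nothing is really registered */
}

static int fake_install(const guid_t *guid, void *interface) {
    (void)interface;
    record_call(TRACE_INSTALL, guid);
    return 0;
}

typedef int (*module_entry)(const fake_boot_services *bs);

/* Run one entry point against the recording stubs; returns the event count. */
size_t run_traced(module_entry entry) {
    static const fake_boot_services bs = { fake_locate, fake_install };
    trace_len = 0;
    entry(&bs);
    return trace_len;
}

/* Example "module": looks up one protocol, then registers its own. */
static const guid_t GUID_A = {{0xAA}}, GUID_B = {{0xBB}};
static int example_iface;
static int example_module(const fake_boot_services *bs) {
    void *dep;
    bs->locate_protocol(&GUID_A, &dep);
    return bs->install_protocol(&GUID_B, &example_iface);
}
```

After `run_traced(example_module)`, `trace_log` holds one lookup and one registration, which is exactly the kind of dependency information a static GUID scan fails to provide.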
With this information in hand, you can do more targeted reversing, trying to identify interfaces and function signatures. For example, LenovoTranslateService.efi installs a protocol e3abb023-b8b1-4696-98e1-8eedc3d3c63d. This protocol turns out to have the following interface:
    struct interface_e3abb023_b8b1_4696_98e1_8eedc3d3c63d {
        void(EFIAPI *translate)(void* _this, const char* input, char* output, size_t length);
    };
With efiperun you can actually write code that calls into loaded EFI modules, which makes it easy to test installed interfaces. Using this functionality, I was able to determine that the translate function above actually translates an ASCII string to keyboard scan codes.
When doing reverse engineering, you always end up exploring branches that turn out to be less fruitful. But the knowledge obtained exploring such a branch can be useful in exploring other ideas. Now that I’ve set the stage with the tools I’m going to use, I will describe the path that led to the discovery of the algorithm. Keep in mind that this is a reconstruction and the order in which I actually figured parts out is different.
Graphical entry point
The Lenovo firmware does not make heavy use of graphical elements, but the Hard Drive Password prompt actually does display a small pictogram, pictured on the right. Now, judging by the filenames, there are only a few modules that deal with graphics:
SystemGraphicsConsoleDxe.efi
SystemHiiImageDisplayDxe.efi
SystemImageDecoderDxe.efi
SystemImageDisplayDxe.efi
Each of these modules installs a single protocol that doesn’t use a well-known GUID, so let’s see which modules call them. As it turns out, only SystemSplashDxe.efi calls SystemHiiImageDisplayDxe.efi (96ce4c12-55e4-4a1c-bbf3-73a5055fb364) and only LenovoPromptService.efi calls SystemImageDisplayDxe.efi (71583a77-2789-4213-a83b-eef42afe85e0). SystemSplashDxe.efi pretty much seems to be as advertised and even contains a GIF file with the ThinkPad splash image. Upon further inspection, LenovoPromptService.efi contains 21 BMP files, all related to displaying password prompts. Bingo!
Password control program
The Prompt service installs a single protocol, 56350810-2cb2-4aa0-96d2-66d1b8e1aac2, which is only called by LenovoPasswordCp.efi. This module contains key code connecting various password-related modules, and I’ll assume Cp means “control program”. Besides the prompt service (for text input), it also calls into LenovoSoundService.efi (e01fc710-ba41-493b-a919-53583368f6d9, for beeping noises when you press an invalid key), LenovoTranslateService.efi (described above) and LenovoCryptService.efi (73e47354-b0c5-4e00-a714-9d0d5a4fdbfd, supposedly a crypto module; see next section).
The password control program has an interesting function at offset 0x8cc that calls only SetMem, CopyMem and the Crypto and Translate services. Here’s roughly the code for this function:
    void _0x8cc(const CHAR16 in[64], UINT8 out[16]) {
        UINT8 ascii[64], scancode[64], hash[32];

        BootServices->SetMem(out,16,0);
        BootServices->SetMem(ascii,64,0);
        BootServices->SetMem(scancode,64,0);
        BootServices->SetMem(hash,32,0);

        /* Narrow the CHAR16 input to single bytes */
        for (int i=0;i<64;i++) {
            ascii[i]=in[i];
        }

        if (TranslateService) {
            TranslateService->Translate(TranslateService,ascii,scancode,64);
            if (CryptService) {
                /* Hash the scan codes; only the first 16 bytes are returned */
                CryptService->SHA256(CryptService,scancode,64,hash);
                BootServices->CopyMem(out,hash,16);
            }
            /* Wipe the intermediate buffers */
            BootServices->SetMem(ascii,64,0);
            BootServices->SetMem(scancode,64,0);
            BootServices->SetMem(hash,32,0);
        } else {
            BootServices->SetMem(ascii,64,0);
        }
    }
I’ll assume that this function is used to hash a password input by the user. There’s another interesting function at offset 0xa30, which checks whether the input CHAR16 is in the character class [0-9A-Za-z ;], which is used to limit the possible characters in the password input.
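Written out in C, a check for that character class would look something like the sketch below. This is my own restatement of the class, not the decompiled function at 0xa30; the name `is_password_char` and the `CHAR16` typedef are assumptions made so the snippet compiles outside an EFI toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t CHAR16;    /* stand-in for UEFI's 16-bit character type */

/* True if c is in the class [0-9A-Za-z ;]. */
bool is_password_char(CHAR16 c) {
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           c == ' ' || c == ';';
}
```

That is 64 accepted characters in total: 10 digits, 52 letters, space and semicolon.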
I’ve made good progress identifying part of the path from password input to security unlock command, but here I’ve hit a dead end. It’s not really clear from where the password control program gets called and what happens to the hash it outputs. I’ll try a different approach next, but first let’s talk about the crypto service.
Crypto service
The password control program calls a function in the Crypto service at offset 0x26e0, which references three GUIDs that I hadn’t seen before:
- 69188a5f-6bbd-46c7-9c16-55f194befcdf
- d0b3d668-16cf-4feb-95f5-1ca3693cfe56
- 6c48f74a-b4df-461f-80c4-5cae8a85b7ee
These GUIDs do not appear in any efiperun output. Instead, I just searched all images for appearances of these GUIDs, and they appear in 10 other images. A noteworthy appearance is in SystemCryptSvcRt.efi at offset 0x1c70. Offset 0x1c70 is referenced at offset 0x330, where it is immediately followed by the Unicode string “SHA256”. This is followed by a jump table at offset 0x370, which points to 3 jumps at offset 0x33c0 that jump to 3 functions at offsets 0x753c, 0x7570 and 0x760c. The function at offset 0x753c references offset 0x2258, which stores the hash initialization constants for SHA256! The rest of the SystemCryptSvcRt.efi module also contains SHA256 round constants, and similar strings and constants for other algorithms.
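Spotting those constants can be automated. The sketch below scans a buffer for the eight SHA-256 initial hash values, assuming they are stored back to back as little-endian 32-bit words, which is what a constant table in an x86 module would typically look like. The function name `find_sha256_init` is made up for this example, and an implementation that embeds the values as instruction immediates would not be found this way.

```c
#include <stddef.h>
#include <stdint.h>

/* SHA-256 initial hash values H0..H7 (FIPS 180-4). */
static const uint32_t SHA256_H0[8] = {
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
    0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19
};

/* Read a 32-bit little-endian word from p. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Return the offset of the first contiguous little-endian copy of the
 * SHA-256 initial values in buf, or -1 if there is none. */
long find_sha256_init(const uint8_t *buf, size_t len) {
    for (size_t i = 0; i + 32 <= len; i++) {
        size_t w = 0;
        while (w < 8 && read_le32(buf + i + 4 * w) == SHA256_H0[w]) w++;
        if (w == 8) return (long)i;
    }
    return -1;
}
```

The same approach works for the round constants or for other algorithms by swapping in a different table.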
All in all this suggests that the Crypto service is a front for the cryptographic routines in SystemCryptSvcRt.efi and that the password control program calls SHA256. I wrote a small test program for the EFI shell that tests this:
    /* Write len bytes of buf as lowercase hex digits into str (no terminator) */
    void buf2hexstr(VOID*buf,CHAR16*str,UINTN len) {
        UINTN i;
        static CHAR16 hchars[16]={'0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'};
        UINT8* buf_=(UINT8*)buf;
        for (i=0;i<len;i++) {
            *(str++)=hchars[*(buf_) >>4];
            *(str++)=hchars[*(buf_++)&0xf];
        }
    }

    EFI_STATUS Initialize(...) {
        ...
        /* GUID of the protocol installed by LenovoCryptService.efi */
        EFI_GUID guid={0x73e47354,0xb0c5,0x4e00,{0xa7,0x14,0x9d,0x0d,0x5a,0x4f,0xdb,0xfd}};
        void* intf;
        if (SystemTable->BootServices->LocateProtocol(&guid,NULL,&intf)==EFI_SUCCESS) {
            const char* in="TEST";
            char out[32]={0};
            CHAR16 str[13+(32*2)+2]=L"SHA256 test: ";
            /* Call the first function pointer in the interface, which should be SHA256 */
            ((void(*)(void*,const char*,UINTN,char*))(*(void**)intf))(intf,in,4,out);
            buf2hexstr(out,str+13,32);
            str[13+(32*2)]=L'\n';
            SystemTable->ConOut->OutputString(SystemTable->ConOut, str);
            SystemTable->ConOut->OutputString(SystemTable->ConOut, L"Expected: 94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2\n");
        } else {
            SystemTable->ConOut->OutputString(SystemTable->ConOut, L"Unable to load CryptService protocol\n");
        }
        ...
    }
Outputs:
SHA256 test: 94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2
Expected: 94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2
Success!
Hard-drive communication
As mentioned, I discovered how the input password got hashed, but it still needs to be sent to the drive. The UEFI standard defines the ATA Pass Thru Protocol, which can be used to send raw ATA commands to a drive. This protocol is very likely to be used for sending ATA security commands. This protocol is not loaded upon initialization by any modules, but the GUID does appear in the following modules:
FdiskOem.efi
LenovoHdpManagerDxe.efi
LenovoMfgBenchEventDxe.efi
SystemAhciAtaAtapiPassThruDxe.efi
SystemAhciBusDxe.efi
SystemAhciBusSmm.efi
SystemIdeAtaPassThruDxe.efi
SystemIdeBusDxe.efi
Wait a minute, is that second module called Lenovo Hard Drive Password Manager? Why yes, it is. There’s a bunch of code in this module, but I found an interesting function call chain for you:
- offset 0xce0
- offset 0x8a0
- CryptService.SHA256
- offset 0x144c
- offset 0x232c
- EFI_ATA_PASS_THRU_PROTOCOL.PassThru
- offset
- offset
The input to the SHA256 function consists of a parameter to the function at offset 0xce0 and data from an EFI runtime variable “LenovoHddSecInfoVar”. The PassThru function is called with an ATA_OP_SECURITY_UNLOCK command block including the hash generated just before. I assume the input to the function at offset 0xce0 is the password hash from the password control program, but what is the data in “LenovoHddSecInfoVar”? The dmpstore utility in the EFI shell will dump runtime variables. Here’s mine:
Variable BS '2D8FBE63-3A04-4EF8-A8A4-77321DB5A9AB:LenovoHddSecInfoVar' DataSize = 8
00000000: 98 7D BC B7 00 00 00 00- *........*
From the code I know that the value is being used as a memory address, so let’s use the mem utility to dump that:
B7BC7D98: .. .. .. .. .. .. .. ..-18 E0 0F B8 00 00 00 00 ........*
B7BC7DA8: 98 DF 0F B8 00 00 00 00-.. .. .. .. .. .. .. .. *........
Those are two more memory addresses, let’s see what’s there:
B80FDF98: 61 53 73 6D 6E 75 20 67-53 53 20 44 34 38 20 30 *aSsmnu gSS D48 0*
B80FDFA8: 56 45 20 4F 30 35 47 30-20 42 20 20 20 20 20 20 *VE O05G0 B *
B80FDFB8: 20 20 20 20 20 20 20 20-.. .. .. .. .. .. .. .. *
B80FE018: 31 53 48 44 53 4E 46 41-30 42 38 35 39 34 20 45 *1SHDSNFA0B8594 E*
B80FE028: 20 20 20 20 .. .. .. ..-.. .. .. .. .. .. .. .. *
If you squint your eyes just right, those kind of read Samsung SSD 840 EVO 500GB and S1DHNSAFB05849E, the Model Number and Serial Number for my SSD, respectively. Piecing all this together, you get the algorithm described in my other blog post.
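The squinting can be replaced by a few lines of C. ATA identify strings are stored as 16-bit words with the first character of each pair in the high byte, so a byte-wise dump on a little-endian machine shows every pair of characters swapped. The helper below undoes the swap and strips the space padding; `ata_string` is a name I made up for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Swap each pair of bytes from src into dst, then strip trailing spaces.
 * dst must have room for len + 1 bytes. */
void ata_string(const char *src, size_t len, char *dst) {
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        dst[i] = src[i + 1];
        dst[i + 1] = src[i];
    }
    if (i < len) {              /* odd trailing byte is copied as-is */
        dst[i] = src[i];
        i++;
    }
    while (i > 0 && dst[i - 1] == ' ') i--;
    dst[i] = '\0';
}
```

Feeding it the bytes at B80FDF98 gives `Samsung SSD 840 EVO 500GB`, and the bytes at B80FE018 give `S1DHNSAFB05849E`.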
Conclusion
As I mentioned, this story is the abridged version of how I found the password hashing algorithm. In reality, I looked at many other modules, including many hours spent looking at useless things. In the end though, I prevailed and found what I was looking for, developing a bunch of tools in the process:
- efiperun: Load and run EFI PE image files in a regular OS environment.
- guiddb: Scan files for GUIDs and output them in C-source file format.
- memdmp: Dump UEFI memory using EFI shell.
- tree: A Ruby abstraction for a firmware tree on your filesystem previously extracted by UEFIExtract.
I hope these tools are of use to anyone. Patches welcome. ☺️