windbg | The Mad Virtualizer at Microsoft

On Debugging Virtual Applications: Part 3 – Situations where a Debugger is most needed for Virtual Applications

February 23, 2016 madvirtualizer 1 comment

In parts 1 and 2 of this series, I covered some of the fundamentals of debugging compiled software code which, for the most part, has been a black box for many working with virtual applications. In part 3, I would like to move the discussion towards those situations that warrant further debugging analysis. Note that for this topic, I will be limiting the scope to user mode applications, as application virtualization products like App-V are pretty much limited to these.

Native Code vs. Managed Code

Managed code refers to computer source code that will only execute under the management of a special runtime module. Technically, managed code refers to any code that requires a specific runtime, but even native code may depend on underlying libraries contained within middleware (e.g. the VC runtimes or Java). Microsoft coined the term “Managed Code” to describe code that requires the .NET Common Language Runtime (CLR). It is written in .NET languages such as C# or Visual Basic .NET. Memory management, including reference tracking and garbage collection, is handled by the underlying runtime, hence the term managed.

Native code refers to the classic Windows code that has been written in C, C++, Visual Basic, and others in rare cases. While native code can easily be debugged through classic debuggers, managed code requires additional extensions. Native code is compiled to work directly with the underlying operating system, while managed code is precompiled to an intermediate language and then converted to native code at run time by the .NET JIT (just-in-time) compiler, allowing for greater portability. Native code is compiled for a specific operating system and hardware architecture; porting it to other architectures requires recompiling.

Situation 1: Application Crashes

When an application crashes, it stops – immediately. Data is lost. It’s dead. It usually happens because the application has encountered an unexpected exception that was not anticipated and therefore cannot be properly handled within the application. Because of this, a special program within the operating system must deal with the aftermath. How the crash is intercepted and handled will depend on specific configurations within the operating system. How much data (active memory snapshots) is actually saved for diagnostics also varies with how things are configured. Whether the module is native code or managed code will also potentially affect how it is handled.

Native Code Unexpected Errors/Exceptions

Depending on the operating system, when a normal application crashes, you will be interrupted with a dialog telling you that the application has stopped working and will be closed. Longtime users of Windows may recognize the older 16-bit Windows blue screen that stated this. In recent versions of Windows (Windows Vista and later), WER (Windows Error Reporting) kicks in, cleans up the virtual address space, and makes a record of the information, even collecting a basic mini-dump. You will see a message like the one below.

WER is the modern version of an older program which would collect basic diagnostic data and a mini-dump. This program was called Dr. Watson and has been around since the dawn of Windows. WER takes this a step further and can report the crash to Microsoft. This is especially helpful if the program crashing is a Microsoft application. Other ISVs often incorporate JIT (just-in-time) debuggers to collect information for their diagnostic purposes as well.

Developers, Support Technicians, and Diagnostic Engineers will often register an alternative JIT debugger to intercept an application crash. Often times this is so they can do live debugging, or substitute their preferred collection tool for collecting user mode dump files for further debugging. For example, if you have either Windows Debugger (WINDBG) installed or Visual Studio, you will see this instead:

Note that in the above example, the debugger is able to intercept at the exception point (just in time) so there could even be further live/step-through debugging done.

The Windows Error Reporting model (https://msdn.microsoft.com/en-us/library/windows/desktop/bb513641(v=vs.85).aspx) is really a newer generation of what was formerly Dr. Watson in that it integrates with Microsoft for diagnostic and troubleshooting telemetry data called Software Quality Metrics (or SQM) data. WER defaults to collecting SQM data and minidumps, but can easily be managed centrally, especially through GPOs. By default, minimal data is collected in the realm of user mode dumps, and the dump files go into a special queue for upload to Microsoft or potentially another central repository.

You can configure WER to generate bigger dumps that can be more useful when debugging application crashes. You must provide some additional configuration to WER. Since I usually use this as a temporary measure (where I cannot install additional tools) I usually remove these once I have the data that I need.

First, create a folder where the dumps will be stored out of band from the WER queue (e.g. C:\dumps). Then import the below text as a .reg file:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps]
"DumpFolder"=hex(2):63,00,3a,00,5c,00,64,00,75,00,6d,00,70,00,73,00,00,00
"DumpCount"=dword:0000000a
"DumpType"=dword:00000002

Alternatively, you could batch this out:

mkdir C:\Dumps
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /v DumpType /t REG_DWORD /d 2 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /v DumpCount /t REG_DWORD /d 100 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /v DumpFolder /t REG_EXPAND_SZ /d C:\Dumps /f

If you went the .reg route, double-click the file to import it into the registry. If you wish to change the path where dumps are stored, you can edit the following value to reflect the preferred location:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\DumpFolder
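Because the .reg file stores DumpFolder as a hex(2) (REG_EXPAND_SZ) value, changing the path there means re-encoding it as UTF-16LE bytes with a terminating null. Here is a minimal Python sketch of that conversion. The helper names are hypothetical, and it assumes the value fits on a single line (exported .reg files wrap long values with trailing backslashes, which this does not handle).

```python
def decode_reg_expand_sz(hex_value):
    """Turn the comma-separated bytes of a .reg hex(2) value into a string."""
    raw = bytes(int(part, 16) for part in hex_value.split(","))
    # The bytes are UTF-16LE and end with a two-byte null terminator.
    return raw.decode("utf-16-le").rstrip("\x00")


def encode_reg_expand_sz(path):
    """Turn a path into the comma-separated byte list used after hex(2):"""
    raw = (path + "\x00").encode("utf-16-le")
    return ",".join(f"{byte:02x}" for byte in raw)


# The value from the .reg file above
print(decode_reg_expand_sz("63,00,3a,00,5c,00,64,00,75,00,6d,00,70,00,73,00,00,00"))
# A different (hypothetical) dump location, ready to paste after hex(2):
print(encode_reg_expand_sz("D:\\CrashDumps"))
```

The first call prints c:\dumps, which confirms what the byte string in the .reg file points at. The second shows what to paste if you prefer another folder.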

Even when I use the above, I still have trouble collecting good dumps when dealing with virtual application crashes. Not to mention, in my case I often have Visual Studio and the Windows Debugger installed together. I prefer (and strongly recommend) using the Sysinternals tool ProcDump (https://technet.microsoft.com/en-us/sysinternals/dd996900.aspx) to intercept application crashes and collect dumps for post-mortem debugging.

One great feature of ProcDump is that you can also configure it to override WER (and other debuggers) as the default debugger by registering it with the JIT Application Exception Debugger key (AEDebug) and also control where the user mode dumps are stored. Once that happens, you will then see ProcDump intercept and collect dumps by default when application crashes occur.

Managed Code Unhandled Exceptions

For .NET applications that encounter unhandled exceptions within managed code, you get a different experience. The runtime incorporates elements for this very type of thing giving the user a different option than what they would normally get with WER or the alternative registered with AEDebug.

Exception Handling

Since we have discussed the fact that many application crashes come from unhandled exceptions, often access violations or what used to be referred to as GP faults, let’s discuss actual exception handling and why it is important. Exception handling is literally what it sounds like: a method, written in code, of anticipating exceptions and controlling how they will be handled, either through special processing or through something relatively easy to track (often interface messages and/or logging). One of the many reasons we have WER (and historically Dr. Watson) in Windows is to ensure that the operating system has a last-resort method of handling these issues. Not every exception can be caught with programming, and the usefulness of how it is handled will often vary from developer to developer. 🙂

In early Windows, primarily the early days of Win32, one of the touted features was the capability of leveraging SEH (Structured Exception Handling) in the Win32 API. I first read and learned about SEH, again, from the master Matt Pietrek back in the NT 4 days reading MSJ (the earlier version of MSDN Magazine). To my excitement and amazement, the article is still available online and worth a read to this day (https://www.microsoft.com/msj/0197/Exception/Exception.aspx). Exception handling leverages trap and try/catch statements in most programming languages. Here is an implementation in its simplest form using native code (C++):

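__try
{
    // lpstr is presumed to point at invalid memory, so this write faults
    *lpstr = '\0';
}
__except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION ?
          EXCEPTION_EXECUTE_HANDLER :
          EXCEPTION_CONTINUE_SEARCH)
{
    // Runs only for access violations; any other exception keeps searching
    MessageBox(NULL, "EXCEPTION_ACCESS_VIOLATION", "CRAPPYAPP.EXE", MB_OK);
    g.fHandledViolation = FALSE;
}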

This exception would yield the following message but still allow the program to continue.

How exception levels differ

Whether the exception comes in the form of a popup window or is just simply logged into a file or to ETW, it is up to the programmer to ensure this. It is also up to the developer as to the level of information that will be collected. To use a managed code example, the following exception handler will sufficiently catch the exception:

try
{
    // code that may throw goes here
}
catch (Exception e)
{
    Console.WriteLine("ERROR: Unexpected error.");
    Console.WriteLine("DETAILS: {0}", e.Message);
    return 3;
}

But the example below would serve even better as it would give more diagnostic information for troubleshooting:

try
{
    // code that may throw goes here
}
catch (Exception e)
{
    Console.WriteLine("ERROR: Unexpected error.");
    Console.WriteLine("EXCEPTION TYPE: {0}", e.GetType().Name);
    Console.WriteLine("DETAILS: {0}", e.Message);
    Console.WriteLine("STACKTRACE: {0}", e.StackTrace);
    return 3;
}

This will log more detailed information including a trace of the current thread stack. Not sure what “Stack Trace” means? Do not worry, that will be covered in a later post.

Situation 2: Hangs

When an application hangs, you can break into the process using a debugger or you can use a debugging utility to save the existing data in memory into a dump file for post-mortem analysis. Windows has a similar mechanism for user mode hangs as it does for application crashes using WER. With WER, you are often given options to either wait, close the program and destroy the hung memory space, or capture a small dump and upload it to Microsoft to see if it is possibly a known issue with even perhaps a known solution.

In the case of virtual applications, the data collected can often be unreliable, so in place of using WER, I will use WinDBG or ProcDump to collect a hang dump for further analysis.

Situation 3: Heap Corruption

Another type of issue that can happen often with virtual applications is heap corruption. The process heap is a special section of usable memory that is initially created when a process is created. Its initial size is determined by the linker when the executable is built. Additional heaps may be created and destroyed, and blocks within them allocated and freed, throughout the life of a process. With managed code, the heap is managed by the runtime (.NET CLR), which allocates objects and provides memory services such as the garbage collector.

Heap corruption usually occurs when a process allocates a block of heap memory of a given size and then a thread writes to and/or frees memory addresses beyond the requested size of the heap block. Heap corruption can also occur when a process writes to a block of memory that has already been freed. Sometimes it is pretty obvious and the debugger is able to quickly assess it:

0xc0000374 – A heap has been corrupted.

Usually during a heap API operation on the thread stack:

77b888c8 77b15c49 ntdll!RtlpFreeHeap+0x59c49

77b888cc 77abb4c8 ntdll!RtlFreeHeap+0x268

77b888d0 76eecb60 kernelbase!GlobalFree+0xc0

77b888d4 7796cd18 kernel32!GlobalFreeStub+0x28

77b888d8 000a1871 shttyapp!CorruptDatHeap+0xd1

77b888dc 000a2525 shttyapp!WndProc+0x345

77b888e0 777d84f3 user32!_InternalCallWinProc+0x2b

77b888e4 777b6c40 user32!UserCallWinProcCheckWow+0x1f0

77b888e8 777b6541 user32!DispatchMessageWorker+0x231

77b888ec 777d6f30 user32!DispatchMessageA+0x10

77b888f0 000a1242 shttyapp!WinMain+0x152

77b888f4 000a2c19 shttyapp!__tmainCRTStartup+0x11a

77b888f8 779638f4 kernel32!BaseThreadInitThunk+0x24

77b888fc 77ae5e13 ntdll!__RtlUserThreadStart+0x2f

77b88900 77ae5dde ntdll!_RtlUserThreadStart+0x1b
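Notice that the trace above ends up inside RtlFreeHeap rather than at the write that actually did the damage. A toy model can show why. In this hedged Python sketch, an overrun goes unnoticed when it happens and is only caught when the block is freed. The ToyHeap class, the guard pattern, and the block names are all hypothetical, and the real Windows heap manager validates its own metadata rather than anything this simple.

```python
GUARD = b"\xAB" * 8  # hypothetical guard pattern placed after every block


class HeapCorruption(Exception):
    """Raised when the guard bytes after a block were overwritten."""


class ToyHeap:
    def __init__(self):
        self._blocks = {}  # name -> (storage, requested size)

    def alloc(self, name, size):
        # Storage is the requested bytes followed by the guard pattern.
        self._blocks[name] = (bytearray(size) + bytearray(GUARD), size)

    def write(self, name, offset, data):
        # Like a raw pointer write: nothing checks the requested size.
        storage, _ = self._blocks[name]
        for i, byte in enumerate(data):
            storage[offset + i] = byte

    def free(self, name):
        # The damage is only noticed here, well after the bad write.
        storage, size = self._blocks.pop(name)
        if bytes(storage[size:]) != GUARD:
            raise HeapCorruption(f"guard bytes after block '{name}' were overwritten")


heap = ToyHeap()
heap.alloc("title", 16)
heap.write("title", 12, b"Z" * 8)  # runs 4 bytes past the requested 16: no error yet
try:
    heap.free("title")
except HeapCorruption as err:
    print("detected at free time:", err)
```

The stack at detection time therefore points at whoever freed the block, which is not necessarily the code that overran it.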

Most of the time, you have to incorporate more specific tools to track it down. Speaking of that:

Next up: more on the tools!

Categories: Uncategorized Tags: AeDebug, app-v, AppV, debug, Debugging, dotnet, JIT, procdump, wer, Win32, windbg

On Debugging Virtual Applications: Part 2: Types and Modes

February 21, 2016 madvirtualizer 3 comments

Productive virtual application debugging requires an understanding of the fundamentals of debugging compiled software code. For this part of my series on debugging virtual applications, I will be focusing exclusively on these fundamentals. If you are already familiar with these concepts, please allow me to quickly recap them for those readers who may be unfamiliar, or only somewhat familiar and looking to solidify these concepts.

Types of Debugging

There are several categories of debugging and the descriptions will vary by vendor, publication, and academic degrees of description. There is almost a guaranteed point of view when it comes to applying it to a specific product or series of products. Being that my discussion primarily revolves around products that run on top of the Windows operating system, my point of view, or slant, is obviously geared towards the types and toolsets that come with Windows.

Live Debugging

Live debugging refers to the mechanism of attaching to a running program or process either invasively or non-invasively. A debugger may attach to a process and wait for exceptions or set a specific breakpoint. The debugger can insert those breakpoints in once attached to the process. The easiest way to think of a breakpoint is to understand its most basic definition: a breakpoint is a place or time at which an interruption or change is made. More information on breakpoints and different breakpoint types within the Windows context can be found here: https://msdn.microsoft.com/en-us/library/windows/hardware/ff538928(v=vs.85).aspx. In addition, live debugging is also commonly used to troubleshoot and analyze code within the developer environment. In those situations the types of breakpoints will vary. For example, you can refer to the examples of breakpoints that are available within the Visual Studio development environment here: https://msdn.microsoft.com/en-us/library/bb161312.aspx. Once attached to the process, a debugger can then step through threads and functions as the application is live.

Print or Trace Debugging

This is the most common method for troubleshooting software applications and operating systems as, technically, it can cover a wide scope of methods. An application can run at specific diagnostic levels, generating additional output and information that can be collected into a file or database and used to isolate an issue. Event traces, log files, and debug output all fall into this category. Strictly speaking within Windows, applications can leverage the OutputDebugString (ODS) API to have an application, service, or operating system component generate what is referred to as “debug spew”, and you can use various tools to collect or view this debug trace information. The most popular tool for viewing ODS traces is the Debug View utility (DBGVIEW) from the Sysinternals suite (https://technet.microsoft.com/en-us/sysinternals/debugview.aspx), although this is not the only one. More information on OutputDebugString can be found here (https://msdn.microsoft.com/en-us/library/windows/desktop/aa363362(v=vs.85).aspx).

In addition, there are tools that can hook into the Windows operating system to capture Win32 API and other application functions through the use of a simple user mode monitor (like the API Monitor tool) or even deeper through the use of a kernel-level filter driver (like Process Monitor). Literally troubleshooting outside the box and on to the wire, you can use network traffic/protocol analysis tools like Wireshark or Message Analyzer (https://www.microsoft.com/en-us/download/details.aspx?id=44226) to capture network traces. These are all forms of trace debugging.
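As a rough illustration of the OutputDebugString mechanism described above, the following Python sketch emits a line of debug spew through the Win32 OutputDebugStringW function via ctypes, which a viewer such as Debug View could then display. The debug_spew helper and the [myapp] prefix are hypothetical, and the fallback to stderr on non-Windows systems is only there so the sketch runs anywhere.

```python
import ctypes
import sys


def debug_spew(message):
    """Send one line of debug output and return the line that was sent."""
    line = f"[myapp] {message}\n"  # hypothetical prefix to make filtering easy
    if sys.platform == "win32":
        # OutputDebugStringW takes a single wide-character string.
        ctypes.windll.kernel32.OutputDebugStringW(line)
    else:
        # Assumption for portability only: no ODS outside Windows.
        sys.stderr.write(line)
    return line


debug_spew("virtual environment initialized")
```

A consistent prefix like this makes it much easier to filter your own output from the rest of the spew on a busy machine.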

Windows Integrated Tracing and Instrumentation

Prior to Windows Vista, there were Event Logs, ODS tracing, text-based log files, etc. all within Windows, each requiring their own tools and APIs. Starting with Windows 2000, Microsoft began incorporating Event Tracing for Windows (ETW) into the operating system and soon, applications and Windows components were using this common engine for enabling diagnostics and collecting detailed debug tracing. Viewing of these traces was soon integrated into the Windows Event Viewer, and users of App-V 5 are often able to resolve issues using this very mechanism.

The instrumentation mechanism is discussed here (https://msdn.microsoft.com/en-us/library/zs6s4h68(v=vs.110).aspx) and probably the greatest technical reference on ETW can be found here (https://msdn.microsoft.com/en-us/library/windows/desktop/bb968803(v=vs.85).aspx).

Remote Debugging

Remote debugging is a form of live debugging where the process being debugged runs on a different system from the debugger. In most Windows cases, this is where there is an issue that needs to be debugged at the kernel mode level prior to the completion of an operating system boot or a system level crash. To start remote debugging, a debugger connects to a remote computer over a network or via a serial cable. The debugger can then control the execution of the program on the remote system and retrieve information about its state. In Windows, this is often done serially or via FireWire.

Post Mortem Debugging

Post Mortem debugging is a very common method of troubleshooting problems within software because it involves viewing a historical point-in-time snapshot of a hang, system, or application crash. This is where a debugger will read in a snapshot of debugging data called a “dump file” which contains existing memory and instruction pointers. The degrees of debugging depend on how much data is collected in the dump file as dump files can vary in what they collect. When it comes to application and system dumps, these can be controlled by the operating system’s default handlers (once called Dr. Watson for user mode applications) as to what information is collected in the dump file.

I was first introduced to the concept of post-mortem debugging reading an article by Matt Pietrek way back in 1992 in Dr. Dobb's Journal. Matt historically is one of the earliest writers on the subject of Windows debugging, going back nearly three decades. The amazing thing is you can still read the article I am citing as it is available online: http://www.drdobbs.com/tools/postmortem-debugging/184408832
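User mode dump files of the kind discussed above are written in the minidump format, whose header begins with the four-byte signature MDMP followed by a version field and a stream count. As a small sketch, assuming that header layout and using hypothetical function names, a script can sanity-check that a collected file really is a user mode dump before it is shipped off for analysis:

```python
import struct

MINIDUMP_SIGNATURE = b"MDMP"


def describe_dump_header(header):
    """Return (is_minidump, stream_count) from the first 12 bytes of a file."""
    if len(header) < 12:
        return (False, 0)
    # Signature (4 bytes), version (4 bytes), number of streams (4 bytes)
    signature, _version, stream_count = struct.unpack_from("<4sII", header)
    if signature != MINIDUMP_SIGNATURE:
        return (False, 0)
    return (True, stream_count)


def describe_dump_file(path):
    """Read just the header of the file at 'path' and describe it."""
    with open(path, "rb") as dump:
        return describe_dump_header(dump.read(12))


# Synthetic header for demonstration only, not taken from a real dump
sample = MINIDUMP_SIGNATURE + struct.pack("<II", 0xA793, 9)
print(describe_dump_header(sample))  # (True, 9)
```

This says nothing about how much memory the dump contains, only that the file is the right kind of container.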

Execution Modes of Debugging in Windows

When we are speaking of execution modes in Windows, we’re talking about code that runs either in user mode or kernel mode. The execution mode affects the methodologies and tools you will leverage in order to properly debug the issue. Software is ultimately driven by the processor (CPU). For a computer running Windows, the CPU runs in two different modes: user mode and kernel mode. The CPU switches between the two depending on the code.

Kernel Mode

The kernel and other operating system components run in kernel mode, hence the term. Rather than a macrokernel like other operating systems, Windows runs a smaller microkernel that runs as process SYSTEM. Just as an application loads and uses DLLs (dynamic-link libraries), the kernel also loads special modules called executive components and/or filter drivers alongside device drivers. There is essentially only one process running, and that is what shows up in the Windows Task Manager as “System”. If this process crashes . . . well . . . so does the entire computer. When we are debugging in kernel mode, we are essentially debugging this process; however, it also serves as the governor of all of the other processes running on the system in user mode. All code that runs in kernel mode runs in a single virtual address space. This means that a faulty kernel-mode driver is not isolated from other drivers and the operating system itself.

User Mode

Regular applications, middleware, plug-ins, and most services run in user mode. When you start a user-mode application, Windows creates a process for the application. This process will execute one or more threads. I use the description of the process itself being inert in nature. It just owns a private virtual address space, a private handle table, and contains at least one primary thread for execution. This description of a process comes from Jeffrey Richter, who has written many books on Win32 programming. Because these processes are isolated from each other, an application that crashes is unable to screw up the operation of another, separate process. Other applications and the operating system are not affected by the crash. Data can be exchanged between these processes through interprocess communication mechanisms, but they cannot write directly to each other's address spaces. Limiting the virtual address space of a user-mode application prevents the application from altering, and possibly damaging, critical operating system data.
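The isolation described above is easy to observe from a script. In this sketch, a parent process launches a child that deliberately aborts. The child dies, but the parent carries on and simply sees a non-zero exit status. Using Python's os.abort() as a stand-in for a crash is an assumption made for illustration, and the exact exit status differs between operating systems.

```python
import subprocess
import sys


def run_crashing_child():
    """Start a child process that aborts and return its exit status."""
    child_code = "import os; os.abort()"  # stand-in for an application crash
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


status = run_crashing_child()
print("child exit status:", status)
print("parent is still running")
```

Contrast this with kernel mode, where there is no separate address space to contain the damage.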

App-V Tie-ins

The App-V product is especially complex when using it as an example because it contains code at both kernel and user mode. The App-V client engine consists of kernel level drivers, a primary service, and user-mode DLLs that are injected into the processes of virtualized applications.

Next Up: Debugging Misbehaving Application Scenarios

Categories: Uncategorized Tags: AppV, appv5, dbgview, Debugging, etl, etw, procmon, Win32, windbg

On Debugging Virtual Applications: Part 1: Overview (or Let’s Start at the Beginning)

February 19, 2016 madvirtualizer 4 comments

For many application packagers, virtual application sequencers, and general IT pros, the concept of actual “debugging” can take on many meanings. Often the words “troubleshooting” and “debugging” are interspersed – especially when reading articles and blogs dealing with the topic of trying to dissect what may be occurring when a virtual application is not functioning as expected. When we speak of the word “debugging” in the context of its meaning with regards to programming and compiled software code, it is simply the dissection and reverse engineering of binaries to determine the root cause of an issue or basically “find the bug” in the code.

The level of depth may vary depending on the tools being leveraged and the amount of access to code or symbols. For example, open-sourced code projects on the web are very easy to debug because – well – the code is distributed alongside the binaries. In addition, special files called “symbols” are also often available if need be. This is especially helpful in the world of Windows debugging. For closed-source binaries like Windows, access is limited to the APIs that are exposed and documented within MSDN, the SDKs, and publicly available symbol files. Still, this tremendously aids our ISV partners when they are troubleshooting issues with their own code running on top of the Windows platform.

Enter the Virtual Application

What makes this especially complex in the world of virtual applications is that the surface area expands to include not only the original application but also the virtualization engine that is maintaining its sandbox, specifically its isolation and/or state separation mechanisms. With this, you have essentially increased your variables for issues. Whereas a native application involves one vendor running on top of another vendor’s operating system, a virtual application now deals with potentially three different vendors (not even counting the potential number of 3rd-party vendors that could also be hooked into the kernel via filter and device drivers). In the case of Microsoft and App-V, if the application being virtualized is a Microsoft application, there are unlimited resources internally to work on that application. In most cases, that represents less than one-hundredth of one percent of the applications out there in the ecosystem, at best. In most cases, the application is external. When that is the case, the debugger must determine the following:

Is the application open sourced or closed source?

If the application is open-sourced, the application can be easily investigated alongside the virtualization subsystems and likely debugged, provided that the individual doing the debugging understands the source code and has the proper tools to debug the application from within the shared operating system (in the case of App-V, that would be Windows).

If the application is closed source, what resources are available from the vendor?

This is where it can be challenging. When you are debugging a closed source application running virtually, it requires significant insight into the application, especially if the application is running in native code. While Microsoft makes public symbols available for ISVs to help with debugging, often the opposite is not true. As a result, the debugging is “best-effort” at best and is usually limited to basic reverse engineering tools like Process Monitor, API Monitor, or DbgView. One exception to this that I have encountered has been situations where the application encounters specific issues when virtualized, and those issues cannot be reproduced on a natively installed instance of the application. In those cases, the focus can shift to the virtualization engine; however, even in these situations, working in triangulation with the application vendor yields more success, much quicker.

Is the application using a 3rd-party application virtualization engine by a vendor different from the vendor of the underlying operating system?

In this scenario, the application is written by one vendor, running on top of an operating system by a different vendor, and then sandboxed using an application virtualization by yet another vendor. In the case of Windows, the application is using a non-Microsoft virtualization solution. There have been many times where I was working support for App-V and a customer would call in with an issue they were having virtualizing a version of Office or Visual Studio on a non-Microsoft platform. I would always re-direct the customer to the vendor of the app virt stack – even though we were the vendor of the application being virtualized as well as the underlying operating system. I would then direct the customer to reach out to the Office or Visual Studio team as well to work in triangulation.

Relationships of Application to Support Vendors

When debating the best source for debugging virtual applications, please feel free to leverage the following matrix I constructed to assist you in reaching out to the most likely resources that will be able to help resolve the issue.

Scenario | Application Vendor | Operating System Vendor | AppVirt Stack Vendor | Best Vendor(s) for Virt Debugging
Best Case Scenario | Vendor A
Rare in the Windows World | Vendor A | Vendor B | Vendor A
Typical 3rd-party AppVirt Scenario | Vendor A | Vendor B | Vendor C | Vendor C first & Vendor A optional
Most Common at Microsoft | Vendor A | Vendor B | Vendor B first & Vendor A optional

The reason I make the above recommendations is that at some point the application, the application virtualization engine, or even perhaps the operating system may require some debugging, especially if there is a potential bug. If the people troubleshooting the issue do not have access to the resources and tools needed to debug it, then you are essentially throwing darts against the wall, and it will potentially lead you down a rabbit hole.

Why Discuss Debugging?

I have decided to start discussing the topic of virtual application debugging to serve the following purposes:

To demystify the concept for application packagers and IT Pros in the Application Virtualization space. There are tools and concepts that can help these professionals to further arm their skills and enhance their arsenals and toolboxes. Many reverse engineering tools such as ProcMon can only go so far.
To aid software vendors in how to debug applications running under App-V and how their applications may be affected.

To aid customers in how to gather and collect the appropriate debugging information to help Microsoft and other software vendors diagnose issues, isolate root cause, and resolve problems and bugs quicker.

Next Up Part 2: Types, Modes, and Situations

Categories: Uncategorized Tags: AppV, Debugging, ISV, procmon, troubleshooting, windbg

Why is Internet Explorer Crashing on Shutdown? An interesting App-V-related Issue . . .

November 20, 2012 madvirtualizer Leave a comment

Recently, I came across something very interesting. I was working with a customer who was working with several internally developed applications that leveraged HTML files by creating links that would open them inside the user’s default browser. These applications can easily be virtualized with both App-V 5.0 and 4.6 (VAE, <LOCAL_INTERACTION_ALLOWED>)

What was happening was that these applications were behaving oddly when running virtualized with App-V. The applications would trigger the local browser (running outside the bubble) for these help documents (in this example, Internet Explorer 8). While there were no issues with this particular function, every time a user would close one of the Internet Explorer windows containing one of these documents, the window would disappear as normal. Then, a second or two later, a window would pop up stating that the application had crashed.

Oddly enough, we knew pretty quickly that this had to be somewhat environmental because we could never reproduce these issues on a vanilla test machine. This was not due to a limitation or a potential code defect within the App-V virtualization engine. After rudimentary elimination of all factors (IE settings, App-V, GPO, branding from the IEAK), we decided to just cut to the chase and debug it with WINDBG to determine why we could not reproduce the issue outside the customer’s environment.

Of course, there are several ways to collect user dumps (process dumps). In this case, the issue was happening on Windows 7, where the default AE (Application Exception) debugger, WER (Windows Error Reporting), will often suffice. We configured WER to generate a user dump by making a few registry changes.

We gave the customer the following .REG file to import into one of the offending machines.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps]
"DumpFolder"=hex(2):63,00,3a,00,5c,00,64,00,75,00,6d,00,70,00,73,00,00,00
"DumpCount"=dword:0000000a
"DumpType"=dword:00000002

This created a full user process dump and put the user dumps in the C:\Dumps folder.

Then once we had the user dump we took a look at the stack trace of the corrupting shutdown thread inside of WINDBG:

0:000> k
ChildEBP RetAddr
0013f714 77476a04 ntdll!KiFastSystemCallRet
0013f718 75656a36 ntdll!NtWaitForMultipleObjects+0xc
0013f7b4 75dfbd1e KERNELBASE!WaitForMultipleObjectsEx+0x100
0013f7fc 75dfbd8c kernel32!WaitForMultipleObjectsExImplementation+0xe0
0013f818 75e105df kernel32!WaitForMultipleObjects+0x18
0013f884 75e1087a kernel32!WerpReportFaultInternal+0x186
0013f898 75e10828 kernel32!WerpReportFault+0x70
0013f8a8 75e107a3 kernel32!BasepReportFault+0x20
0013f934 774a7f02 kernel32!UnhandledExceptionFilter+0x1af
0013f93c 7744e324 ntdll!__RtlUserThreadStart+0x62
0013f950 7744e1b4 ntdll!_EH4_CallFilterFunc+0x12
0013f978 77477199 ntdll!_except_handler4+0x8e
0013f99c 7747716b ntdll!ExecuteHandler2+0x26
0013f9c0 7744f98f ntdll!ExecuteHandler+0x24
0013fa4c 77476ff7 ntdll!RtlDispatchException+0x127
0013fa4c 5483ccd4 ntdll!KiUserExceptionDispatcher+0xf
WARNING: Frame IP not in any known module. Following frames may be wrong.
0013fd60 7748d690 <Unloaded_PseudoServerInproc.dll>+0xccd4
0013fd7c 7748e3d9 ntdll!RtlProcessFlsData+0x57
0013fe14 7748e12f ntdll!LdrShutdownProcess+0xbd
0013fe28 75e0bbd6 ntdll!RtlExitUserProcess+0x74
0013fe3c 775836dc kernel32!ExitProcessStub+0x12
0013fe48 77583371 msvcrt!__crtExitProcess+0x17
0013fe80 775836bb msvcrt!doexit+0xac
0013fe94 0103129e msvcrt!exit+0x11
0013ff1c 75dfed4c iexplore!__wmainCRTStartup+0x164
0013ff28 7749377b kernel32!BaseThreadInitThunk+0xe
0013ff68 7749374e ntdll!__RtlUserThreadStart+0x70
0013ff80 00000000 ntdll!_RtlUserThreadStart+0x1b

There was an external DLL (not part of the standard Windows build) that had been loaded into the process and appeared to be partially or fully unloaded by the time WER could capture the exception. In the stack, the process is shutting down (ntdll!LdrShutdownProcess) when ntdll!RtlProcessFlsData calls into an address that belonged to the unloaded module, and the exception is raised there. We wanted to see if this DLL was being injected by way of the AppInit_DLLs key. At the time, we did not know what this particular DLL was part of. All we knew was that PseudoServerInproc.dll appeared to be an unknown DLL that was injected into the IE process.
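
When you have a pile of dumps to triage, it can help to scan saved `k` output for frames that land in unloaded modules. The following is a minimal Python sketch under my own assumptions: it expects the stack text to have been copied out of WINDBG into a string, and the function name and regular expression are hypothetical helpers, not debugger features.

```python
import re

# Matches frames such as "<Unloaded_PseudoServerInproc.dll>+0xccd4".
UNLOADED_FRAME = re.compile(
    r"<Unloaded_(?P<module>[^>]+)>\+0x(?P<offset>[0-9a-fA-F]+)"
)


def find_unloaded_modules(stack_text: str) -> list:
    """Return (module, offset) pairs for frames inside unloaded modules."""
    hits = []
    for line in stack_text.splitlines():
        match = UNLOADED_FRAME.search(line)
        if match:
            hits.append((match.group("module"), int(match.group("offset"), 16)))
    return hits


# Three frames copied from the stack trace above.
sample = """\
0013fa4c 5483ccd4 ntdll!KiUserExceptionDispatcher+0xf
0013fd60 7748d690 <Unloaded_PseudoServerInproc.dll>+0xccd4
0013fd7c 7748e3d9 ntdll!RtlProcessFlsData+0x57
"""
for module, offset in find_unloaded_modules(sample):
    print(f"{module} at offset {offset:#x}")
```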

We went to the AppInit_DLLs value in the registry and found MFAHook, a commonly known master hook DLL used in Citrix products. We disabled all DLLs in that key by using the following .REG file:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows]
"LoadAppInit_DLLs"=dword:00000000

The issue immediately went away. Now that we knew it was tied to one of the Citrix products being leveraged on the machine, we went back to the AppInit_DLLs key to examine MFAHook. We needed to go further and determine which specific hook DLL was the issue, and we already knew the DLL in question was PseudoServerInproc.dll.

So we went into the Citrix configuration to get all of the specific hook agents and the processes they inject into and found our DLL under the following registry key:

Key: HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\CtxHook\AppInit_Dlls\HDXMediaStreamForFlash
Value: FilePathName
Data: C:\Program Files\Citrix\ICAService\PseudoServerInproc.dll

A subkey denotes the executables it hooks into:

HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\CtxHook\AppInit_Dlls\HDXMediaStreamForFlash\iexplore.exe

We found that deleting this subkey also fixed the issue. Upon further investigation with Citrix, we found that this was related to a known issue with one of the products in their VDI suite, and they already had a fix for it (which worked like a champ!)

More info here:

http://forums.citrix.com/thread.jspa?threadID=307236&tstart=0

On a side note, if you want to use a more extensive tool for collecting these kinds of crashes for analysis, I would highly encourage you to download ProcDump and configure it to be your default application experience debugger. You can enable it by importing the following .REG file:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug]
"Auto"="1"
"Debugger"="C:\\procdump\\procdump.exe /accepteula -ma %ld C:\\Dumps"
"UserDebuggerHotKey"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug]
"Auto"="1"
"Debugger"="C:\\procdump\\procdump.exe /accepteula -ma %ld C:\\Dumps"
"UserDebuggerHotKey"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\DbgManagedDebugger]
"Auto"="1"
"Debugger"="C:\\procdump\\procdump.exe /accepteula -ma %ld C:\\Dumps"
"UserDebuggerHotKey"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\DbgManagedDebugger]
"Auto"="1"
"Debugger"="C:\\procdump\\procdump.exe /accepteula -ma %ld C:\\Dumps"
"UserDebuggerHotKey"=dword:00000000

This will register ProcDump as the default post-mortem debugger instead of WER. It will intercept the exception and store a dump file in the specified folder.
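
One detail worth remembering when hand-editing a file like this: inside a .REG string value, backslashes and double quotes each have to be escaped with a backslash, and the quotes must be plain ASCII quotes. A minimal Python sketch that builds a correctly escaped Debugger line from the plain command line is shown here. The helper names are hypothetical, and the ProcDump path is simply the one used in this example.

```python
def reg_string(value: str) -> str:
    """Quote a string for a .REG file, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def reg_value_line(name: str, value: str) -> str:
    """Build a single name=value line for a REG_SZ value."""
    return f"{reg_string(name)}={reg_string(value)}"


# Plain command line as you would type it at a prompt (path is an example).
debugger = r"C:\procdump\procdump.exe /accepteula -ma %ld C:\Dumps"
print(reg_value_line("Debugger", debugger))
```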

Categories: Uncategorized Tags: AppInit_DLLs, AppV, citrix, IE, procdump, vae, wer, windbg
