
Android Stability (2): Governance Ideas

2025-01-13 16:07:48

This article was also published on the public account Mobile Development Things: Android Stability (2): Governance Ideas

Generally speaking, Android stability covers crashes and ANRs. This article focuses on crashes (the application's crash rate) and describes how to approach Android stability work. Before discussing specific ideas, let's first look at Android's exception catching mechanism.

1 Exception catching mechanism

At the language level, Android's exception catching mechanism can be divided into a Java layer and a native (C++) layer.

1.1 Java exception catching mechanism

1.1.1 Basics

Throwable is the base class of all exceptions and has two important subclasses:

  • Error: serious system errors, such as OOM (OutOfMemoryError), which applications generally cannot recover from;
  • Exception: exceptions that the application can catch and handle, such as NPE (NullPointerException).

What we usually deal with in code are Exception-related exceptions. They fall into two categories, depending on whether the compiler requires them to be handled:

  • Checked exceptions: must be handled at compile time, otherwise the code will not compile (usually by wrapping the call in try-catch, or by declaring the exception with a throws clause in the method signature). An example is IOException during file operations;
  • Unchecked exceptions: do not have to be handled at compile time but can occur at runtime. These are RuntimeException and its subclasses.
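To make the distinction concrete, here is a minimal sketch. The class and method names (ExceptionKindsDemo, firstLine, lengthOf) are made up for illustration. The first method must declare IOException or it will not compile; the second compiles without any declaration and only fails at runtime.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ExceptionKindsDemo {
    // Checked: IOException must be declared with `throws` (or caught here),
    // otherwise the compiler rejects this method.
    static String firstLine(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        }
    }

    // Unchecked: a NullPointerException needs no declaration and only
    // surfaces at runtime when `value` is null.
    static int lengthOf(String value) {
        return value.length();
    }

    public static void main(String[] args) {
        try {
            System.out.println(firstLine("hello\nworld"));
        } catch (IOException e) {
            System.out.println("I/O failed: " + e.getMessage());
        }
        try {
            lengthOf(null);
        } catch (NullPointerException e) {
            System.out.println("caught NPE at runtime");
        }
    }
}
```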

1.1.2 Use

Besides wrapping code that may throw in try-catch, you can also use Thread.setDefaultUncaughtExceptionHandler to capture exceptions application-wide. For example, the handler can be set in the Application class; when an unhandled exception occurs, the exception information can be recorded for later analysis. Global exception handling is usually implemented with a custom class that implements the Thread.UncaughtExceptionHandler interface:

import androidx.annotation.NonNull;

class ACrash implements Thread.UncaughtExceptionHandler {
    // The previously installed handler, kept so the exception can be passed on.
    public Thread.UncaughtExceptionHandler exceptionHandler;

    @Override
    public void uncaughtException(@NonNull Thread t, @NonNull Throwable e) {
        // Customized processing is done here based on the exception type and thread.

        // After the custom logic, decide whether to pass the exception on to the original exception handler.
        if (exceptionHandler != null) {
            exceptionHandler.uncaughtException(t, e);
        }
    }
}

One thing to note when installing a custom exception handler: if you also use a third-party crash collection system such as Bugly or ACRA, check whether a handler has already been set before installing yours:

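Thread.UncaughtExceptionHandler tmpHandler = Thread.getDefaultUncaughtExceptionHandler();
ACrash aHandle = new ACrash();
// Retain the original exception handler so it is still invoked
aHandle.exceptionHandler = tmpHandler;
Thread.setDefaultUncaughtExceptionHandler(aHandle);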
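The chaining behaviour can be tried outside Android with plain Java. The following is a sketch under assumptions, not the author's implementation: ChainedHandlerDemo and ChainingHandler are hypothetical names, and the lambda labelled "sdk" merely stands in for a handler that a third-party crash SDK might have installed earlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ChainedHandlerDemo {
    // Records which handlers ran, in order (for illustration only).
    static final List<String> calls = Collections.synchronizedList(new ArrayList<>());

    // Hypothetical custom handler: does its own work, then delegates to the previous handler.
    static class ChainingHandler implements Thread.UncaughtExceptionHandler {
        private final Thread.UncaughtExceptionHandler previous;

        ChainingHandler(Thread.UncaughtExceptionHandler previous) {
            this.previous = previous;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            calls.add("custom:" + e.getMessage());
            if (previous != null) {
                previous.uncaughtException(t, e);
            }
        }
    }

    static void run() throws InterruptedException {
        // Stand-in for a handler installed earlier by a crash SDK (assumption).
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> calls.add("sdk:" + e.getMessage()));

        // Read the existing handler first, then wrap it instead of replacing it.
        Thread.UncaughtExceptionHandler existing = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler(new ChainingHandler(existing));

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        });
        worker.start();
        worker.join();
    }

    public static void main(String[] args) throws InterruptedException {
        run();
        System.out.println(calls);
    }
}
```

If the existing handler were not read before the new one is installed, the SDK's handler would be lost and its crash reports would stop arriving.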

1.2 Native exception catching mechanism

1.2.1 Basics

In the native layer, besides the language-level try-catch and throw mechanism, there is also a system-level signal delivery and handling mechanism. The system reports an exceptional condition by delivering a signal, so native exception handling is essentially signal handling.
Signal handlers are generally registered with the sigaction function:

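#include <signal.h>
#include <string.h>

static void signalHandler(int signal, siginfo_t *info, void *reserved) {
    // handle signal
}

void initSignalHandler() {
    struct sigaction action;
    memset(&action, 0, sizeof(action)); // zero every field before use
    sigemptyset(&action.sa_mask);       // block no extra signals while the handler runs
    action.sa_flags = SA_SIGINFO;       // use the three-argument sa_sigaction handler
    action.sa_sigaction = signalHandler;
    sigaction(SIGSEGV, &action, NULL);  // capture segmentation faults
}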

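Common signals:

  • SIGSEGV (11): invalid memory reference
  • SIGABRT (6): abort signal raised by abort()
  • SIGFPE (8): arithmetic exception (the name means "floating-point exception", but it also covers integer division by zero)
  • SIGILL (4): illegal instruction
  • SIGBUS (7 on Linux x86/ARM, and therefore on Android; 10 on some other platforms): bus error (bad memory access)
  • SIGKILL (9): kill signal (cannot be caught or handled)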

1.2.2 Use

After a native exception is caught, you still need to obtain the minidump file and restore the full stack, which is relatively complicated. So in general we do not register signal handlers ourselves. Instead we use a third-party solution such as Bugly, or handle it with the Breakpad library (Bugly is also built on Breakpad underneath). For Breakpad usage, see: How to use Google Breakpad on Android platform

2 Classification and governance ideas

2.1 Classification

Apart from routine business code fixes (such as the usual NPE, IndexOutOfBoundsException, etc.), stability optimization can be roughly divided, from the operating system's perspective, into the following categories:

  • Memory stability optimization
  • Thread stability optimization
  • System problem optimization

2.2 Governance ideas

Stability governance ultimately serves users, so the whole effort is centered on improving the user experience. The main ideas are:

  • If a problem can be fixed, fix it (such as NPE and OOM);
  • If it cannot be fixed at the business level (such as system bugs), minimize the impact on users and downgrade some exceptions.

In practice, improving the user experience means reducing the application's crash rate. One core principle applies here: spend most of your effort analyzing the Top 10 and Top 20 problems. Solve the main problems first and pick up long-tail problems along the way.
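As a rough illustration of the Top N principle, the sketch below groups crash reports by a signature string and reports how much of the total the top entries cover. The class name TopCrashes, the signature format, and the sample data are all hypothetical; a real crash platform would supply this ranking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopCrashes {
    // Counts reports per crash signature and returns the n most frequent, highest first.
    static List<Map.Entry<String, Integer>> topN(List<String> signatures, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String signature : signatures) {
            counts.merge(signature, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so ties keep their first-seen order.
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // Share of all reports that the given top entries account for.
    static double coverage(List<Map.Entry<String, Integer>> top, int totalReports) {
        int covered = 0;
        for (Map.Entry<String, Integer> entry : top) {
            covered += entry.getValue();
        }
        return totalReports == 0 ? 0.0 : (double) covered / totalReports;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration.
        List<String> reports = Arrays.asList(
                "NullPointerException@FeedAdapter", "OutOfMemoryError@ImageLoader",
                "NullPointerException@FeedAdapter", "DeadSystemException@system",
                "NullPointerException@FeedAdapter", "OutOfMemoryError@ImageLoader");
        List<Map.Entry<String, Integer>> top = topN(reports, 2);
        System.out.println(top + " cover " + coverage(top, reports.size()));
    }
}
```

When a handful of signatures covers most reports, fixing those first moves the crash rate far more than chasing the long tail.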

2.2.1 Memory management

Memory governance mainly focuses on reducing runtime memory usage as much as possible while avoiding memory leaks and out-of-memory errors. Tools help identify where the problem might be:

  • LeakCanary: Square's open-source tool for detecting and diagnosing memory leaks in Android applications, used during development;
  • KOOM: Kuaishou's online memory monitoring solution, which helps optimize application memory;
  • Profiler: Android's built-in performance monitoring tool, which can assist in memory analysis.
    (There are other memory analysis tools in the industry as well. Choose whichever one solves your problem.)

For ways to optimize memory, see the previous article: Android Stability (1): Memory Usage Guide

2.2.2 Thread management

Thread governance mainly focuses on:

  • Thread reuse: schedule work through thread pools wherever possible (different businesses will use different thread pool strategies);
  • Thread recycling: once a thread pool is no longer needed, call shutdown promptly, and clear any variables that still hold a reference to the thread.
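The two points above can be sketched with the standard java.util.concurrent API. The class and method names (PoolLifecycleDemo, runAndShutdown) and the 5-second timeout are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolLifecycleDemo {
    // Runs `tasks` small jobs on a pool of `threads` reusable workers, then releases the pool.
    static int runAndShutdown(int threads, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < tasks; i++) {
                // Each job reuses one of the pooled threads instead of creating a new one.
                pool.execute(done::incrementAndGet);
            }
        } finally {
            // Stop accepting new work; already queued jobs still run to completion.
            pool.shutdown();
        }
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
            // Interrupt anything still running after the timeout.
            pool.shutdownNow();
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed jobs: " + runAndShutdown(2, 10));
    }
}
```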

Here are several optimization directions the author has tried in his own business:

  • Limit the maximum number of threads used by OkHttpClient to avoid unbounded growth, and reuse the same OkHttpClient instance as much as possible;
  • Converge thread creation: provide a small set of methods for obtaining threads from the thread pools, so that business code does not create threads directly with new;
  • Initialize the core thread count of each thread pool differently, according to the needs of each business;
  • Prevent thread instances from being held by singletons, which would leave their resources impossible to release after the thread is closed.
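One possible shape for the "converged" entry point is sketched below. It is an assumption-laden illustration rather than the author's code: AppExecutors, the io()/cpu() split, the pool sizes, and the 30-second keep-alive are all hypothetical choices.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AppExecutors {
    // Core sizes differ per workload (hypothetical values).
    private static final ThreadPoolExecutor IO = newPool("app-io", 4);
    private static final ThreadPoolExecutor CPU =
            newPool("app-cpu", Math.max(2, Runtime.getRuntime().availableProcessors()));

    private AppExecutors() {
    }

    // Business code asks for a shared pool here instead of calling `new Thread(...)`.
    static ExecutorService io() {
        return IO;
    }

    static ExecutorService cpu() {
        return CPU;
    }

    private static ThreadPoolExecutor newPool(String prefix, int coreThreads) {
        AtomicInteger sequence = new AtomicInteger();
        // Named threads make crash stacks and thread dumps easier to attribute.
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + sequence.incrementAndGet());
            thread.setDaemon(true);
            return thread;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, coreThreads, 30, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), factory);
        // Let idle core threads exit so an unused pool holds no threads.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

Because every pool is created in one place, the total thread count has a known upper bound and per-business sizing can be tuned without touching call sites.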

2.2.3 System problem management

Because of Android's version fragmentation, you will run into all sorts of crashes whose stacks contain only system frames and that cannot be solved at the business level. These can only be analyzed case by case. For system problems there is one general governance (analysis) approach: cluster the crashes by the system version on which they occur, to determine whether the problem is specific to certain versions or general. After forming a hypothesis, analyze the source code of the corresponding version to verify it (hypothesize -> confirm the problem -> solve the problem).
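The clustering step can be approximated with a per-version crash rate. This is a minimal sketch under assumptions: the VersionClustering class, the use of session counts as the denominator, and the "factor times the overall rate" threshold are illustrative choices, not something the article prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class VersionClustering {
    // Crash rate per OS version = crashes on that version / sessions on that version.
    static Map<String, Double> crashRateByVersion(Map<String, Integer> crashes,
                                                  Map<String, Integer> sessions) {
        Map<String, Double> rates = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : sessions.entrySet()) {
            int sessionCount = entry.getValue();
            if (sessionCount <= 0) {
                continue; // no usage data for this version
            }
            int crashCount = crashes.getOrDefault(entry.getKey(), 0);
            rates.put(entry.getKey(), (double) crashCount / sessionCount);
        }
        return rates;
    }

    // Versions whose crash rate is at least `factor` times the overall rate.
    static List<String> suspectVersions(Map<String, Integer> crashes,
                                        Map<String, Integer> sessions, double factor) {
        long totalCrashes = 0;
        long totalSessions = 0;
        for (Map.Entry<String, Integer> entry : sessions.entrySet()) {
            if (entry.getValue() <= 0) {
                continue;
            }
            totalSessions += entry.getValue();
            totalCrashes += crashes.getOrDefault(entry.getKey(), 0);
        }
        List<String> suspects = new ArrayList<>();
        if (totalSessions == 0 || totalCrashes == 0) {
            return suspects;
        }
        double overallRate = (double) totalCrashes / totalSessions;
        for (Map.Entry<String, Double> entry : crashRateByVersion(crashes, sessions).entrySet()) {
            if (entry.getValue() >= factor * overallRate) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }
}
```

If one or two versions stand out, the problem is probably version-specific and the matching source tree is the place to look. If the rate is flat across versions, a general cause is more likely.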

Android source code can be browsed online at: Android Code Search

After identifying the problem, think about how to solve it (system problems most likely cannot be fully cured, so you can only minimize their impact on users). Here is a general framework:

  • If the problem can be handled by hooking a system interface, handle it that way (the hook generally has to be done in the C++ layer), for example:
    • Raise a system limit: e.g., on Android 8.1 the system's file descriptor limit is 1024; it can be raised to 4096 through a hook, which reduces problems caused by fd exhaustion;
    • Reduce the severity of the impact: e.g., downgrade an abort-triggered crash in RenderThread to a dropped frame;
  • If the problem cannot be handled by hooking the system:
    • Exceptions that do not affect the user, such as DeadSystemException or a FinalizerWatchdogDaemon timeout, can be caught directly in the business layer (see the exception catching mechanism above);
    • Exceptions that do affect the user experience should still go through the normal crash logic.
3 Summary

This article described Android stability work from the angle of the application's crash rate. It first introduced the exception catching mechanism and then proposed ideas for classification and governance, with the aim of reducing the crash rate and improving the user experience.
With a properly configured exception catching mechanism, a focus on the application's Top 10 and Top 20 problems, and category-by-category governance, application stability can be improved effectively.