Keep It Simple, Stupid

Simple. Happy. Striving.

Maven Source Code Analysis (Part 2)

Preface


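The previous article, Maven Source Code Analysis (Part 1), explained how to start analyzing the Maven source code and noted that Maven's startup relies on plexus-classworlds, a class-loading framework that loads the jars Maven needs to run its commands.

Before continuing, we need to understand the directory layout of the Maven distribution and what each part is for.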

xxx-MBP-2:apache-maven-3.1.1 xxx$ tree
.
├── LICENSE
├── NOTICE
├── README.txt
├── bin
│   ├── m2.conf
│   ├── mvn
│   ├── mvn.bat
│   ├── mvnDebug
│   ├── mvnDebug.bat
│   └── mvnyjp
├── boot
│   └── plexus-classworlds-2.5.1.jar
├── conf
│   ├── logging
│   │   └── simplelogger.properties
│   └── settings.xml
└── lib
    ├── aether-api-0.9.0.M2.jar
    ├── (many jar files omitted here)
    ├── plexus-interpolation-1.19.jar
    ├── plexus-sec-dispatcher-1.3.jar
    ├── plexus-sec-dispatcher.license
    ├── plexus-utils-3.0.15.jar
    ├── sisu-guice-3.1.3-no_aop.jar
    ├── slf4j-api-1.7.5.jar
    ├── slf4j-api.license
    ├── slf4j-simple-1.7.5.jar
    ├── slf4j-simple.license
    ├── wagon-file-2.4.jar
    ├── wagon-http-2.4-shaded.jar
    └── wagon-provider-api-2.4.jar

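The bin directory contains the scripts that run mvn. These scripts set up the Java command, prepare the classpath and the relevant Java system properties, and then execute Java. mvn is a shell script for UNIX platforms and mvn.bat is a batch script for Windows; typing any mvn command on the command line actually invokes one of these scripts. The directory also contains mvnDebug and mvnDebug.bat, again a UNIX shell script and a Windows batch script respectively. How do mvn and mvnDebug differ? Opening the files shows that they are almost identical, except that mvnDebug adds a MAVEN_DEBUG_OPTS setting, which starts Maven with debugging enabled so that Maven itself can be debugged. Finally, the directory contains m2.conf, the configuration file for plexus-classworlds.

The boot directory contains a single file, the plexus-classworlds jar. plexus-classworlds is a class loader framework; compared with the default Java class loaders, it offers a richer configuration syntax, and Maven uses it to load its own libraries. For more information see the classworlds documentation; this article covers the framework at length.

The conf directory contains one very important file, settings.xml. Editing this file directly customizes Maven's behavior globally for the machine. In general we prefer to copy it to ~/.m2/ (where ~ is the user's home directory) and edit the copy, which customizes Maven's behavior for that user only.

The lib directory contains all the Java libraries that Maven needs at runtime.

The other files are omitted here.

What is plexus-classworlds?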


Maven Source Code Analysis (Part 1)

Introduction


Where is the main function?


Let's first look at the mvn script in apache-maven-3.1.1; the source file is mvn.

Look at the end of the file:

CLASSWORLDS_LAUNCHER=org.codehaus.plexus.classworlds.launcher.Launcher

exec "$JAVACMD" \
  $MAVEN_OPTS \
  -classpath "${M2_HOME}"/boot/plexus-classworlds-*.jar \
  "-Dclassworlds.conf=${M2_HOME}/bin/m2.conf" \
  "-Dmaven.home=${M2_HOME}"  \
  ${CLASSWORLDS_LAUNCHER} "$@"

Here we can see that the plexus-classworlds class-loading framework is used to start Maven through Launcher#main.

For the source file, see m2.conf:

main is org.apache.maven.cli.MavenCli from plexus.core

# commented out: set maven.home default ${user.home}/m2

set maven.home default /Users/yangtao/maven-tutorial/apache-maven-3.1.1

[plexus.core]
optionally ${maven.home}/lib/ext/*.jar
load       ${maven.home}/lib/*.jar
load       ${maven.home}/conf/logging

We can step through the code in a debugger to understand it. The example in this article uses IntelliJ, so we need to set a few parameters when starting Launcher.main():

Figure 2

Now let's look closely at what Launcher.main() does:

public static void main( String[] args )
    {
        try
        {
           // 1
            int exitCode = mainWithExitCode( args );

            System.exit( exitCode );
        }
        catch ( Exception e )
        {
            e.printStackTrace();

            System.exit( 100 );
        }
    }

The line marked 1 processes the arguments passed in and returns an exit code. Let's look in detail at what mainWithExitCode does:

public static int mainWithExitCode( String[] args )
        throws Exception
    {
        String classworldsConf = System.getProperty( CLASSWORLDS_CONF );
      
        // handling of the classworlds.conf property is omitted here
        // 1
        launcher.configure( is );

        is.close();

        try
        {
            // 2
            launcher.launch( args );
        }
        catch ( InvocationTargetException e )
        {
            ClassRealm realm = launcher.getWorld().getRealm( launcher.getMainRealmName() );

            URL[] constituents = realm.getURLs();

            // exception-handling details omitted
            // Else just toss the ITE
            throw e;
        }

        return launcher.getExitCode();
    }

At marker 1, the m2.conf configuration file is read and parsed.

At marker 2, MavenCli.main() is invoked via reflection to execute the mvn command.
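The reflective call at marker 2 can be illustrated with plain JDK classes. This is only a sketch, not the classworlds implementation: MiniLauncher and DemoMain are hypothetical names, and a URLClassLoader with no URLs stands in for the plexus.core realm that classworlds builds from m2.conf.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

final class MiniLauncher {
    /** Hypothetical stand-in for org.apache.maven.cli.MavenCli. */
    public static class DemoMain {
        public static int lastArgCount = -1;

        public static void main(String[] args) {
            lastArgCount = args.length;
        }
    }

    /** Loads the named class through the given loader and calls its static main(String[]). */
    static void launch(ClassLoader realm, String mainClassName, String[] args) throws Exception {
        Class<?> mainClass = realm.loadClass(mainClassName);
        Method main = mainClass.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an empty URLClassLoader delegating to our own loader plays the role of a realm.
        ClassLoader realm = new URLClassLoader(new URL[0], MiniLauncher.class.getClassLoader());
        launch(realm, DemoMain.class.getName(), new String[] {"clean", "install"});
        System.out.println("arguments seen by the main class: " + DemoMain.lastArgCount);
    }
}
```

In a real launcher the loader would be given the jar URLs listed in the configuration, which is what lets the main class live outside the launcher's own classpath.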

The next article, Maven Source Code Analysis (Part 2), explains how plexus-classworlds parses the configuration and invokes the specified main function.

Configuring a Data Source in a Karaf Application

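Ever since our company adopted modular development for its main systems we have run into many problems, though all of them have been solvable. For modularization we chose Karaf as the integration container.

During development we ran into a data source configuration problem. Every developer has a separate database, and the production databases are separate as well; the production database configuration can be obtained through a CMDB tool.

To meet these requirements we split the development and production data sources into two independent modules: xx-datasource-dev for development and xx-datasource-production for production. Why two projects? Because our development environment has no CMDB-like tool to manage developers' databases centrally; for now each developer manages their own database.

As a result, our development databases are essentially built through Maven.

In production, the data source is created from the CMDB when the customer system is initialized. Everything looks fine, except that a restart does not work: fetching the CMDB configuration is triggered by a message at deployment time, whereas a restart only restarts Karaf as a whole.

We therefore decided to persist the CMDB configuration obtained during production initialization by writing it to a file. So how do we write and read files in a Karaf environment?

We followed the description in the official Karaf manual.

Example

First we need a POJO that holds the basic data source properties. To keep things simple we set only four: driver, url, username and password.

For the code, see DatasourceConfiguration.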
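The linked class is not reproduced in this article. As a rough idea of its shape, a POJO with the four properties might look like the sketch below. The class name DatasourceSettings, the property keys and the toDictionary helper are all assumptions made for illustration, not the actual DatasourceConfiguration code.

```java
import java.util.Dictionary;
import java.util.Hashtable;

/** Hypothetical sketch of a data source POJO; names and keys are assumptions. */
final class DatasourceSettings {
    private String driver;
    private String url;
    private String username;
    private String password;

    public String getDriver() { return driver; }
    public void setDriver(String driver) { this.driver = driver; }
    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }
    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }
    public String getPassword() { return password; }
    public void setPassword(String password) { this.password = password; }

    /** Flattens the properties into the Dictionary form that OSGi configuration APIs accept. */
    Dictionary<String, Object> toDictionary() {
        Dictionary<String, Object> props = new Hashtable<>();
        putIfPresent(props, "driver", driver);
        putIfPresent(props, "url", url);
        putIfPresent(props, "username", username);
        putIfPresent(props, "password", password);
        return props;
    }

    // Hashtable rejects null values, so unset properties are simply left out.
    private static void putIfPresent(Dictionary<String, Object> props, String key, String value) {
        if (value != null) {
            props.put(key, value);
        }
    }
}
```

Plain getters and setters keep the class usable from a Blueprint bean definition, which injects properties through setters.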

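Next, define a service that reads the configuration, so that when the datasource bundle's activator is triggered it can set up the data source and register the service.

For the code, see DatasourceConfigurationQuery.

The program is simple; the key is the configuration. Karaf's default configuration mechanism currently uses Blueprint.

For the configuration, see datasourceConfiguration.xml.

This configuration is the crucial part. When the bundle is loaded, it is bound to Karaf's /etc/datasource.prop.cfg file, and values can be written and read with the config commands.

When the bundle is published, the corresponding file must also be written to the designated directory:

For the code, see features.xml.

We also need to consider how to install it in the format above (configfile).

The file has to be generated during the build and install, which requires a plugin in pom.xml.

For the code, see pom.xml.

With this in place, just install the bundle in Karaf and check the designated directory to confirm that the file has been generated. Once it exists, values can be set with the config commands that Karaf provides.

The datasource bundle's activator then stores and retrieves the values. There are two ways to do so:

  1. config:list | config:edit
  2. Programmatically, through ConfigurationAdmin

private ConfigurationAdmin configAdmin;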

private void updateDatasourceProp(final DatasourceProp datasourceProp) {
        try {
            Configuration configuration = configAdmin.getConfiguration(DatasourceProp.PID);
            Dictionary<String,Object> props = new Hashtable<String,Object>();
            props.put(DatasourceProp.URL_KEY, datasourceProp.getUrl());
            props.put(DatasourceProp.USERNAME_KEY, datasourceProp.getUsername());
            props.put(DatasourceProp.PASSWORD_KEY, datasourceProp.getPassword());
            props.put(DatasourceProp.MIGRATION_KEY, datasourceProp.getMigration());
            configuration.setBundleLocation(null);
            configuration.update(props);
        } catch (final Exception e) {
            logger.error("reason : datasourceProp update failure", e);
            throw new ExceptionInInitializerError(e);
        }
    }
    
<reference id="configAdmin" interface="org.osgi.service.cm.ConfigurationAdmin"/>

Understanding Java Garbage Collection

Original article

What are the benefits of knowing how garbage collection (GC) works in Java? Satisfying the intellectual curiosity as a software engineer would be a valid cause, but also, understanding how GC works can help you write much better Java applications.

This is a very personal and subjective opinion of mine, but I believe that a person well versed in GC tends to be a better Java developer. If you are interested in the GC process, that means you have experience in developing applications of a certain size. If you have thought carefully about choosing the right GC algorithm, that means you fully understand the characteristics of the application you have developed. Of course, this may not be a common standard for a good developer. However, few would object when I say that understanding GC is a requirement for being a great Java developer.

This is the first of a series of “Become a Java GC Expert” articles. I will cover the GC introduction this time, and in the next article, I will talk about analyzing GC status and GC tuning examples from NHN.

The purpose of this article is to introduce GC to you in an easy way. I hope this article proves to be very helpful. Actually, my colleagues have already published a few great articles on Java Internals which became quite popular on Twitter. You may refer to them as well.

Returning to garbage collection, there is a term that you should know before learning about GC: “stop-the-world.” Stop-the-world will occur no matter which GC algorithm you choose. It means that the JVM stops the application from running in order to execute a GC. When stop-the-world occurs, every thread except the threads needed for the GC stops its tasks. The interrupted tasks resume only after the GC task has completed. GC tuning often means reducing this stop-the-world time.

Generational Garbage Collection

In Java, program code does not explicitly allocate a region of memory and then free it. Some people set the relevant object to null or call the System.gc() method to try to free memory explicitly. Setting a reference to null is not a big deal, but calling System.gc() can affect system performance drastically and must not be done. (Thankfully, I have not yet seen any developer in NHN calling this method.)

In Java, as the developer does not explicitly remove the memory in the program code, the garbage collector finds the unnecessary (garbage) objects and removes them. This garbage collector was created based on the following two hypotheses. (It is more correct to call them suppositions or preconditions, rather than hypotheses.)

  • Most objects soon become unreachable.
  • References from old objects to young objects only exist in small numbers.

Together, these hypotheses are called the weak generational hypothesis. To take advantage of it, the heap in HotSpot VM is physically divided into two parts: the young generation and the old generation.

Young generation: Most of the newly created objects are located here. Since most objects soon become unreachable, many objects are created in the young generation, then disappear. When objects disappear from this area, we say a “minor GC” has occurred.

Old generation: The objects that did not become unreachable and survived from the young generation are copied here. It is generally larger than the young generation. As it is bigger in size, the GC occurs less frequently than in the young generation. When objects disappear from the old generation, we say a “major GC” (or a “full GC”) has occurred.

Let’s look at this in a chart.

The permanent generation from the chart above is also called the “method area,” and it stores classes or interned character strings. So, this area is definitely not for objects that survived from the old generation to stay permanently. A GC may occur in this area. The GC that took place here is still counted as a major GC.

Some people may wonder:

What if an object in the old generation needs to reference an object in the young generation?

To handle these cases, the old generation has something called a “card table,” in which each entry covers a 512-byte chunk of the old generation. Whenever an object in the old generation references an object in the young generation, it is recorded in this table. When a GC is executed for the young generation, only this card table is searched to find the old-generation references into the young generation, instead of checking the references of all the objects in the old generation. The card table is maintained by a write barrier, a mechanism that makes minor GC faster. Though the barrier adds a bit of overhead, the overall GC time is reduced.
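The card-marking idea can be sketched in a few lines of Java. This is a toy model and not HotSpot code: addresses are plain long values, CardTableSketch and its methods are hypothetical names, and the barrier marks the card for every store it is told about rather than checking where the reference points.

```java
import java.util.ArrayList;
import java.util.List;

final class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes covered by each card

    private final long oldGenStart;
    private final byte[] cards;

    CardTableSketch(long oldGenStart, long oldGenSizeBytes) {
        this.oldGenStart = oldGenStart;
        // One byte per 512-byte chunk, rounding up.
        this.cards = new byte[(int) ((oldGenSizeBytes + 511) >>> CARD_SHIFT)];
    }

    /** Write barrier: run after a reference field at fieldAddress in the old generation is updated. */
    void onReferenceStore(long fieldAddress) {
        cards[(int) ((fieldAddress - oldGenStart) >>> CARD_SHIFT)] = 1;
    }

    /** What a minor GC would look at: only the chunks whose card is dirty. */
    List<Integer> dirtyCards() {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] != 0) {
                dirty.add(i);
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0, 4096); // 4096 bytes = 8 cards
        table.onReferenceStore(1030);                          // falls into the chunk 1024..1535
        System.out.println("dirty cards: " + table.dirtyCards());
    }
}
```

The cost of the barrier is one shift and one byte store per reference update, while the saving is that a minor GC scans a few dirty chunks instead of the whole old generation.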

Composition of the Young Generation

In order to understand GC, let’s learn about the young generation, where the objects are created for the first time. The young generation is divided into 3 spaces.

  • One Eden space
  • Two Survivor spaces

There are 3 spaces in total, two of which are Survivor spaces. Objects move through the spaces in the following order:

  1. The majority of newly created objects are located in the Eden space.
  2. After one GC in the Eden space, the surviving objects are moved to one of the Survivor spaces.
  3. After a GC in the Eden space, the objects are piled up into the Survivor space, where other surviving objects already exist.
  4. Once a Survivor space is full, surviving objects are moved to the other Survivor space. Then, the Survivor space that is full will be changed to a state where there is no data at all.
  5. Objects that are still alive after these steps have been repeated a number of times are moved to the old generation.

As you can see by checking these steps, one of the Survivor spaces must remain empty. If data exists in both Survivor spaces, or the usage is 0 for both spaces, then take that as a sign that something is wrong with your system.
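The flow of objects through Eden, the Survivor spaces and the old generation can be simulated with a small toy model. It is a sketch under loud assumptions: YoungGenSketch is a hypothetical class, space capacities are not modeled, the two Survivor spaces swap roles on every minor GC (a simplification of steps 3 and 4), and the tenuring threshold of 3 is an arbitrary demo value.

```java
import java.util.ArrayList;
import java.util.List;

final class YoungGenSketch {
    static final int TENURING_THRESHOLD = 3; // assumed demo value, not a JVM default

    static final class Obj {
        final String name;
        boolean reachable = true;
        int age = 0;

        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // Survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // Survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o); // new objects start in Eden
        return o;
    }

    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // swap roles so one Survivor space is always empty
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.reachable) {
                continue; // garbage is simply not copied
            }
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o); // promoted to the old generation
            } else {
                to.add(o);
            }
        }
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        Obj longLived = heap.allocate("cache");
        heap.allocate("temp").reachable = false;
        for (int i = 1; i <= TENURING_THRESHOLD; i++) {
            heap.minorGc();
            System.out.println("after GC " + i + ": survivor=" + heap.from.size() + " old=" + heap.old.size());
        }
        System.out.println("promoted: " + heap.old.contains(longLived));
    }
}
```

Even in this simplified form the invariant from the text holds: after every minor GC one of the two Survivor lists is empty.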

The process of data piling up into the old generation through minor GCs can be shown as in the below chart:

Note that in HotSpot VM, two techniques are used for faster memory allocations. One is called “bump-the-pointer,” and the other is called “TLABs (Thread-Local Allocation Buffers).”

The bump-the-pointer technique tracks the last object allocated in the Eden space. That object sits at the top of the Eden space. When a new object is created, only its size needs to be checked against the space remaining in Eden. If it fits, it is placed in the Eden space and becomes the new top. So, when new objects are created, only the most recently added object needs to be checked, which allows much faster memory allocation. However, it is a different story in a multithreaded environment. To keep allocation in the Eden space thread-safe when multiple threads create objects, a lock is unavoidable, and performance drops due to lock contention. TLABs are the solution to this problem in HotSpot VM. Each thread is given a small portion of the Eden space as its own share. As each thread can access only its own TLAB, the bump-the-pointer technique can allocate memory without a lock.
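Both techniques fit in a short sketch. The code below is a toy model with hypothetical names (BumpAllocatorSketch, Region) and made-up sizes: addresses are plain numbers, and a ThreadLocal stands in for the per-thread buffer.

```java
final class BumpAllocatorSketch {
    /** A contiguous region [start, end) whose 'top' marks the next free byte. */
    static final class Region {
        final long end;
        long top;

        Region(long start, long size) {
            this.top = start;
            this.end = start + size;
        }

        /** Returns the address of the new object, or -1 if it does not fit. */
        long allocate(long size) {
            if (top + size > end) {
                return -1;
            }
            long address = top;
            top += size; // bump the pointer
            return address;
        }
    }

    private final Region eden;
    private final long tlabSize;
    private final ThreadLocal<Region> tlab = new ThreadLocal<>();

    BumpAllocatorSketch(long edenSize, long tlabSize) {
        this.eden = new Region(0, edenSize);
        this.tlabSize = tlabSize;
    }

    long allocate(long size) {
        Region local = tlab.get();
        long address = (local == null) ? -1 : local.allocate(size);
        if (address >= 0) {
            return address; // fast path: thread-private buffer, no lock
        }
        long wanted = Math.max(tlabSize, size);
        synchronized (eden) { // slow path: carve a fresh buffer out of shared Eden
            long start = eden.allocate(wanted);
            if (start < 0) {
                return -1; // Eden is full; a real JVM would run a minor GC here
            }
            local = new Region(start, wanted);
        }
        tlab.set(local);
        return local.allocate(size);
    }

    public static void main(String[] args) {
        BumpAllocatorSketch heap = new BumpAllocatorSketch(1024, 256);
        System.out.println(heap.allocate(100));
        System.out.println(heap.allocate(100));
        System.out.println(heap.allocate(100)); // no longer fits in the first buffer
    }
}
```

The lock is taken only when a thread needs a new buffer, so the common case is a bounds check plus an addition.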

This has been a quick overview of the GC in the young generation. You do not necessarily have to remember the two techniques that I have just mentioned. You will not go to jail for not knowing them. But please remember that objects are first created in the Eden space, and that long-surviving objects are moved to the old generation through the Survivor spaces.

GC for the Old Generation

The old generation basically performs a GC when it is full. The execution procedure varies by GC type, so it is easier to understand if you know the different types of GC.

According to JDK 7, there are 5 GC types.

  1. Serial GC
  2. Parallel GC
  3. Parallel Old GC (Parallel Compacting GC)
  4. Concurrent Mark & Sweep GC (or “CMS”)
  5. Garbage First (G1) GC

Among these, the serial GC must not be used on an operating server. This GC type was created when there was only one CPU core on desktop computers. Using this serial GC will drop the application performance significantly.
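To see which collectors a given JVM is actually running with, the standard java.lang.management API can be queried. The sketch below uses only JDK classes; ActiveCollectors is a hypothetical name, and the collector names it prints depend on the JVM version and on the -XX flags used at startup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** One line per collector: its name, how often it ran, and the accumulated time. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

Running it with different flags, for example -XX:+UseSerialGC and -XX:+UseParallelGC, is a quick way to confirm that an option took effect.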

Now let’s learn about each GC type.

Serial GC (-XX:+UseSerialGC)

The GC in the young generation uses the type we explained in the previous paragraph. The GC in the old generation uses an algorithm called “mark-sweep-compact.”

  1. The first step of this algorithm is to mark the surviving objects in the old generation.
  2. Then, it checks the heap from the front and leaves only the surviving ones behind (sweep).
  3. In the last step, it fills up the heap from the front with the objects so that the objects are piled up consecutively, and divides the heap into two parts: one with objects and one without objects (compact).
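The three steps can be sketched on a toy heap. This is an illustration only, not the HotSpot algorithm: the heap is an array of slots, MarkSweepCompactSketch and Obj are hypothetical names, and reachability starts from an explicit list of roots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class MarkSweepCompactSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    /** Step 1: mark every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Step 2: walk the heap from the front and clear the slots of unmarked objects. */
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) {
                heap[i] = null;
            }
        }
    }

    /** Step 3: slide survivors to the front; returns the number of live objects. */
    static int compact(Obj[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) {
                continue;
            }
            heap[i] = null;
            o.marked = false; // reset for the next cycle
            heap[next++] = o;
        }
        return next;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(c);
        Obj[] heap = {a, b, c, d};
        mark(Arrays.asList(a));
        sweep(heap);
        int live = compact(heap);
        System.out.println("live objects: " + live);
    }
}
```

After compaction the live objects occupy the first slots and everything behind them is free, which is the "one part with objects, one part without" layout described in step 3.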

The serial GC is suitable for a small memory and a small number of CPU cores.

Parallel GC (-XX:+UseParallelGC)

From the picture, you can easily see the difference between the serial GC and the parallel GC. While the serial GC uses only one thread to process a GC, the parallel GC uses several threads and is therefore faster. This GC is useful when there is enough memory and a large number of cores. It is also called the “throughput GC.”

Parallel Old GC(-XX:+UseParallelOldGC)

Parallel Old GC has been supported since a JDK 5 update. Compared to the parallel GC, the only difference is the GC algorithm for the old generation. It goes through three steps: mark, summary and compaction. The summary step identifies the surviving objects separately for the areas on which GC has previously been performed, and is thus different from the sweep step of the mark-sweep-compact algorithm. Its steps are a little more complicated.

CMS GC (-XX:+UseConcMarkSweepGC)

As you can see from the picture, the Concurrent Mark-Sweep GC is much more complicated than any of the GC types explained so far. The initial mark step is simple: only the surviving objects closest to the class loader are searched, so the pause is very short. In the concurrent mark step, the objects referenced by the surviving objects that have just been confirmed are tracked and checked. What distinguishes this step is that it proceeds while other threads are running. In the remark step, the objects that were newly added or stopped being referenced during the concurrent mark step are checked. Lastly, in the concurrent sweep step, the garbage collection itself takes place, again while other threads are still running. Because the GC works in this manner, its pause time is very short. The CMS GC is also called the low latency GC, and is used when response time is crucial for the application.

While this GC type has the advantage of short stop-the-world time, it also has the following disadvantages.

  1. It uses more memory and CPU than other GC types.
  2. The compaction step is not provided by default.

You need to review these points carefully before using this type. Also, if the compaction task needs to be carried out because of heavy memory fragmentation, the stop-the-world time can be longer than with any other GC type. You need to check how often and for how long the compaction task is carried out.

G1 GC

Finally, let’s learn about the garbage first (G1) GC.

If you want to understand G1 GC, forget everything you know about the young generation and the old generation. As you can see in the picture, the heap is laid out as a grid of areas; objects are allocated to one area, and then a GC is executed. Once that area is full, objects are allocated to another area, and then a GC is executed. The steps where the data moves from the three spaces of the young generation to the old generation cannot be found in this GC type. This type was created to replace the CMS GC, which has caused a lot of issues and complaints in the long term.

The biggest advantage of the G1 GC is its performance. It is faster than any other GC types that we have discussed so far. But in JDK 6, this is called an early access and can be used only for a test. It is officially included in JDK 7. In my personal opinion, we need to go through a long test period (at least 1 year) before NHN can use JDK7 in actual services, so you probably should wait a while. Also, I heard a few times that a JVM crash occurred after applying the G1 in JDK 6. Please wait until it is more stable.

I will talk about GC tuning in the next issue, but I would like to ask you one thing in advance. If the size and the type of all objects created in the application were identical, all the GC options for the WAS used in our company could be the same. But the size and the lifespan of the objects created by the WAS vary depending on the service, and the type of equipment varies as well. In other words, just because a certain service uses the GC option “A,” it does not mean that the same option will bring the best results for a different service. It is necessary to find the best values for the WAS threads, the number of WAS instances per machine, and each GC option through constant tuning and monitoring. This did not come from my personal experience, but from a discussion by the engineers who make the Oracle JVM at JavaOne 2010.

In this issue, we have only glanced at the GC for Java. Please look forward to our next issue, where I will talk about how to monitor the Java GC status and tune GC.

I would like to note that I referred to a new book released in December 2011 called “Java Performance” (Amazon, it can also be viewed from safari online, if the company provides an account), as well as “Memory Management in the Java HotSpot™ Virtual Machine,” a white paper provided by the Oracle website. (The book is different from “Java Performance Tuning.”)

By Sangmin Lee, Senior Engineer at Performance Engineering Lab, NHN Corporation.

How to Analyze Java Thread Dumps

The content of this article was originally written by Tae Jin Gu on the Cubrid blog.

(The source address no longer hosts the article; this copy is a repost.)

When there is a problem, or when a Java-based web application is running much slower than expected, we need to use thread dumps. If thread dumps feel very complicated to you, this article may help. Here I will explain what threads are in Java, their types, how they are created, how to manage them, how you can dump threads from a running application, and finally how you can analyze the dumps and determine the bottleneck or blocking threads. This article is the result of long experience in Java application debugging.

Java and Threads

A web server uses tens to hundreds of threads to process a large number of concurrent users. If two or more threads utilize the same resources, a contention between the threads is inevitable, and sometimes deadlock occurs.

Thread contention is a status in which one thread is waiting for a lock, held by another thread, to be released. Different threads frequently access shared resources in a web application. For example, to record a log, the thread trying to record the log must obtain a lock and access the shared resources.

Deadlock is a special type of thread contention, in which two or more threads are waiting for the other threads to complete their tasks in order to complete their own tasks.

Different issues can arise from thread contention. To analyze such issues, you need to use the thread dump. A thread dump will give you the information on the exact status of each thread.

Background Information for Java Threads

Thread Synchronization

A thread can run at the same time as other threads. To keep shared resources consistent when multiple threads try to use them, thread synchronization allows only one thread at a time to access the shared resources.

Thread synchronization in Java can be done using a monitor. Every Java object has a single monitor. The monitor can be owned by only one thread. For a thread to own a monitor that is owned by a different thread, it needs to wait in the wait queue until the other thread releases the monitor.

Thread Status

In order to analyze a thread dump, you need to know the status of threads. The statuses of threads are defined in java.lang.Thread.State.

Figure 1: Thread Status.

  • NEW: The thread has been created but has not started running yet.
  • RUNNABLE: The thread is occupying the CPU and processing a task. (It may actually be waiting for the OS to allocate resources such as processor time.)
  • BLOCKED: The thread is waiting for a different thread to release its lock in order to get the monitor lock.
  • WAITING: The thread is waiting by using a wait, join or park method.
  • TIMED_WAITING: The thread is waiting by using a sleep, wait, join or park method. (The difference from WAITING is that the maximum waiting time is specified by the method parameter, so TIMED_WAITING can be ended by the timeout as well as by external changes.)
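These states can be observed directly with Thread.getState(). The sketch below walks one thread through NEW, WAITING and TERMINATED; ThreadStateDemo, newWaiter and awaitState are hypothetical helper names, and the polling loop is only there because state changes are not instantaneous.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ThreadStateDemo {
    /** Creates (but does not start) a thread that waits on the lock until 'released' is set. */
    static Thread newWaiter(Object lock, AtomicBoolean released) {
        return new Thread(() -> {
            synchronized (lock) {
                while (!released.get()) { // loop guards against spurious wakeups
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "state-demo-waiter");
    }

    /** Polls until the thread reaches the wanted state or about 5 seconds have passed. */
    static boolean awaitState(Thread t, Thread.State wanted) throws InterruptedException {
        for (int i = 0; i < 500 && t.getState() != wanted; i++) {
            Thread.sleep(10);
        }
        return t.getState() == wanted;
    }

    public static void main(String[] args) throws InterruptedException {
        Object lock = new Object();
        AtomicBoolean released = new AtomicBoolean(false);
        Thread waiter = newWaiter(lock, released);

        System.out.println("before start: " + waiter.getState());
        waiter.start();
        awaitState(waiter, Thread.State.WAITING);
        System.out.println("inside wait(): " + waiter.getState());

        synchronized (lock) {
            released.set(true);
            lock.notifyAll();
        }
        waiter.join();
        System.out.println("after join: " + waiter.getState());
    }
}
```

Replacing lock.wait() with lock.wait(60000) in the helper would show TIMED_WAITING instead of WAITING, which matches the distinction drawn in the list above.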

Thread Types

Java threads can be divided into two types:

  1. daemon threads
  2. non-daemon threads

Daemon threads stop working when there are no other non-daemon threads. Even if you do not create any threads, the Java application will create several threads by default. Most of them are daemon threads, mainly for processing tasks such as garbage collection or JMX.

A thread running the ‘static void main(String[] args)’ method is created as a non-daemon thread and appears as “main” in a thread dump. Once this thread and every other non-daemon thread have finished, the JVM exits and all daemon threads stop as well. (Note that the “VM Thread” listed in a HotSpot thread dump is a separate internal thread that performs VM operations; it is not the thread that runs main.)

Getting a Thread Dump

We will introduce the three most commonly used methods. Note that there are many other ways to get a thread dump. A thread dump can only show the thread status at the moment it is taken, so in order to see how thread status changes, it is recommended to take 5 to 10 dumps at 5-second intervals.

Getting a Thread Dump Using jstack

In JDK 1.6 and higher, it is possible to get a thread dump on MS Windows using jstack.

Use jps to check the PID of the currently running Java application process.

[user@linux ~]$ jps -v
25780 RemoteTestRunner -Dfile.encoding=UTF-8
25590 sun.rmi.registry.RegistryImpl 2999 -Dapplication.home=/home1/user/java/jdk.1.6.0_24 -Xms8m
26300 sun.tools.jps.Jps -mlvV -Dapplication.home=/home1/user/java/jdk.1.6.0_24 -Xms8m

Use the extracted PID as the parameter of jstack to obtain a thread dump.

[user@linux ~]$ jstack -F 5824

A Thread Dump Using jVisualVM

Generate a thread dump by using a program such as jVisualVM, a visual tool that ships with the JDK.

Figure 2: A Thread Dump Using visualvm.

The list on the left shows the currently running processes. Click on the process for which you want the information, and select the thread tab to check the thread information in real time. Click the Thread Dump button in the top right corner to get the thread dump file.

Generating in a Linux Terminal

Use the ps -ef command to find the pid of the currently running Java process.

[user@linux ~]$ ps -ef | grep java
user      2477          1    0 Dec23 ?         00:10:45 ...
user    25780 25361   0 15:02 pts/3    00:00:02 ./jstatd -J -Djava.security.policy=jstatd.all.policy -p 2999
user    26335 25361   0 15:49 pts/3    00:00:00 grep java

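Use the extracted pid as the parameter of kill -SIGQUIT (signal 3, so kill -3 followed by the pid also works) to obtain a thread dump. The JVM writes the dump to its standard output and keeps running.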

Thread Information from the Thread Dump File

"pool-1-thread-13" prio=6 tid=0x000000000729a000 nid=0x2fb4 runnable [0x0000000007f0f000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    - locked <0x0000000780b7e688> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    - locked <0x0000000780b7e688> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)

  • Thread name: When using the java.lang.Thread class to generate a thread, the thread will be named Thread-(number), whereas when using the java.util.concurrent.ThreadFactory class, it will be named pool-(number)-thread-(number).
  • Priority: Represents the priority of the threads.
  • Thread ID: Represents the unique ID for the threads. (Some useful information, including the CPU usage or memory usage of the thread, can be obtained by using thread ID.)
  • Thread status: Represents the status of the threads.
  • Thread callstack: Represents the call stack information of the threads.
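The same kinds of fields can be read from inside a running program with Thread.getAllStackTraces(), which is part of the JDK. The sketch below prints a rough imitation of the dump layout; MiniThreadDump is a hypothetical name, the output format is only an approximation of what jstack prints, and the id shown is the Java-level thread id rather than the tid or nid values of a real dump.

```java
import java.util.Map;

final class MiniThreadDump {
    /** Builds a text report with name, priority, id, state and call stack of every live thread. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" prio=").append(t.getPriority())
               .append(" id=").append(t.getId())
               .append(t.isDaemon() ? " daemon" : "")
               .append('\n');
            out.append("   java.lang.Thread.State: ").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Unlike jstack, this report does not show which monitors each frame has locked, so it is a complement to the tools above rather than a replacement.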

Thread Dump Patterns by Type

When Unable to Obtain a Lock (BLOCKED)

This is when the overall performance of the application slows down because a thread is occupying the lock and prevents other threads from obtaining it. In the following example, BLOCKED_TEST pool-1-thread-1 thread is running with <0x0000000780a000b0> lock, while BLOCKED_TEST pool-1-thread-2 and BLOCKED_TEST pool-1-thread-3 threads are waiting to obtain <0x0000000780a000b0> lock.

Figure 3: A thread blocking other threads.

"BLOCKED_TEST pool-1-thread-1" prio=6 tid=0x0000000006904800 nid=0x28f4 runnable [0x000000000785f000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:282)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    - locked <0x0000000780a31778> (a java.io.BufferedOutputStream)
    at java.io.PrintStream.write(PrintStream.java:432)
    - locked <0x0000000780a04118> (a java.io.PrintStream)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
    at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
    at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:85)
    - locked <0x0000000780a040c0> (a java.io.OutputStreamWriter)
    at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:168)
    at java.io.PrintStream.newLine(PrintStream.java:496)
    - locked <0x0000000780a04118> (a java.io.PrintStream)
    at java.io.PrintStream.println(PrintStream.java:687)
    - locked <0x0000000780a04118> (a java.io.PrintStream)
    at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:44)
    - locked <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
    at com.nbp.theplatform.threaddump.ThreadBlockedState$1.run(ThreadBlockedState.java:7)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - <0x0000000780a31758> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"BLOCKED_TEST pool-1-thread-2" prio=6 tid=0x0000000007673800 nid=0x260c waiting for monitor entry [0x0000000008abf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:43)
    - waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
    at com.nbp.theplatform.threaddump.ThreadBlockedState$2.run(ThreadBlockedState.java:26)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - <0x0000000780b0c6a0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"BLOCKED_TEST pool-1-thread-3" prio=6 tid=0x00000000074f5800 nid=0x1994 waiting for monitor entry [0x0000000008bbf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nbp.theplatform.threaddump.ThreadBlockedState.monitorLock(ThreadBlockedState.java:42)
    - waiting to lock <0x0000000780a000b0> (a com.nbp.theplatform.threaddump.ThreadBlockedState)
    at com.nbp.theplatform.threaddump.ThreadBlockedState$3.run(ThreadBlockedState.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - <0x0000000780b0e1b8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
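
The BLOCKED state in the dump above can be reproduced in a few lines. What follows is only a minimal sketch, not the test program that produced the dump: the class `BlockedStateDemo`, its method and the thread name are hypothetical. The caller takes a monitor, starts a second thread that wants the same monitor, and polls that thread's state, which is expected to settle on BLOCKED.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: the names are illustrative, not taken from the article.
final class BlockedStateDemo {

    /** Starts a thread that contends for a monitor the caller holds and returns its state. */
    static Thread.State observeContendingThread() {
        final Object monitor = new Object();
        Thread contender = new Thread(() -> {
            synchronized (monitor) {
                // Entered only after the caller has released the monitor.
            }
        }, "BLOCKED_DEMO-contender");
        contender.setDaemon(true);

        synchronized (monitor) { // the caller owns the monitor before the contender starts
            contender.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            try {
                // Poll until the contender is stuck at the monitor, or give up after 5 seconds.
                while (contender.getState() != Thread.State.BLOCKED
                        && System.nanoTime() < deadline) {
                    Thread.sleep(10);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return contender.getState();
        }
    }

    public static void main(String[] args) {
        System.out.println(observeContendingThread());
    }
}
```

Taking a thread dump while the caller is still inside the `synchronized` block would show the contender with a "waiting to lock" entry, much like pool-1-thread-2 and pool-1-thread-3 above.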

When in Deadlock Status

A deadlock occurs when thread A needs a lock held by thread B to continue its task, while thread B needs a lock held by thread A. In the thread dump below, the DEADLOCK_TEST-1 thread holds lock 0x00000007d58f5e48 and is trying to obtain lock 0x00000007d58f5e60. The DEADLOCK_TEST-2 thread holds lock 0x00000007d58f5e60 and is trying to obtain lock 0x00000007d58f5e78. The DEADLOCK_TEST-3 thread holds lock 0x00000007d58f5e78 and is trying to obtain lock 0x00000007d58f5e48. Each thread is waiting for a lock held by another thread, so the three form a cycle, and this status will not change until one thread releases its lock.

Figure 4: Threads in a Deadlock status.

"DEADLOCK_TEST-1" daemon prio=6 tid=0x000000000690f800 nid=0x1820 waiting for monitor entry [0x000000000805f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
    - waiting to lock <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
    - locked <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
    - None

"DEADLOCK_TEST-2" daemon prio=6 tid=0x0000000006858800 nid=0x17b8 waiting for monitor entry [0x000000000815f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
    - waiting to lock <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
    - locked <0x00000007d58f5e60> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
    - None

"DEADLOCK_TEST-3" daemon prio=6 tid=0x0000000006859000 nid=0x25dc waiting for monitor entry [0x000000000825f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.goMonitorDeadlock(ThreadDeadLockState.java:197)
    - waiting to lock <0x00000007d58f5e48> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.monitorOurLock(ThreadDeadLockState.java:182)
    - locked <0x00000007d58f5e78> (a com.nbp.theplatform.threaddump.ThreadDeadLockState$Monitor)
    at com.nbp.theplatform.threaddump.ThreadDeadLockState$DeadlockThread.run(ThreadDeadLockState.java:135)

   Locked ownable synchronizers:
    - None
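
The cycle described above can also be found mechanically. The sketch below is an assumption-laden illustration, not a tool from the article: the class `DeadlockCycleFinder` is hypothetical, and it relies only on the `"name"`, `- waiting to lock <addr>` and `- locked <addr>` lines in the format shown in this dump. It follows "thread waits for lock, lock is held by thread" links until it returns to the starting thread. For a live JVM, `ThreadMXBean.findDeadlockedThreads()` reports deadlocks directly, without parsing text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses thread dump text in the HotSpot format shown above.
final class DeadlockCycleFinder {
    private static final Pattern HEADER = Pattern.compile("^\\s*\"([^\"]+)\"");
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    /** Returns the thread names forming a wait-for cycle, or an empty list if there is none. */
    static List<String> findCycle(String dump) {
        Map<String, String> waitsFor = new LinkedHashMap<>(); // thread -> lock it wants
        Map<String, String> ownerOf = new HashMap<>();        // lock -> thread holding it
        String current = null;
        for (String line : dump.split("\\R")) {
            Matcher m;
            if ((m = HEADER.matcher(line)).find()) {
                current = m.group(1);
            } else if (current != null && (m = WAITING.matcher(line)).find()) {
                waitsFor.put(current, m.group(1));
            } else if (current != null && (m = LOCKED.matcher(line)).find()) {
                ownerOf.put(m.group(1), current);
            }
        }
        for (String start : waitsFor.keySet()) {
            List<String> path = new ArrayList<>();
            String thread = start;
            // Follow "waits for lock -> lock owner" until the chain ends or repeats.
            while (thread != null && !path.contains(thread)) {
                path.add(thread);
                String lock = waitsFor.get(thread);
                thread = (lock == null) ? null : ownerOf.get(lock);
            }
            if (start.equals(thread)) {
                return path; // the chain led back to where it started
            }
        }
        return new ArrayList<>();
    }
}
```

Fed the three DEADLOCK_TEST entries above, the method walks from DEADLOCK_TEST-1 to DEADLOCK_TEST-2 to DEADLOCK_TEST-3 and back. Fed the earlier BLOCKED_TEST dump, it finds no cycle, because the lock holder is not waiting for anything.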

When Continuously Waiting to Receive Messages from a Remote Server

The thread appears to be normal, since its state keeps showing as RUNNABLE. However, when you line up several thread dumps chronologically, you can see that the socketReadThread thread stays in the same socket read call, waiting indefinitely for data.

Figure 5: Continuous Waiting Status.

"socketReadThread" prio=6 tid=0x0000000006a0d800 nid=0x1b40 runnable [0x00000000089ef000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    - locked <0x00000007d78a2230> (a java.io.InputStreamReader)
    at sun.nio.cs.StreamDecoder.read0(StreamDecoder.java:107)
    - locked <0x00000007d78a2230> (a java.io.InputStreamReader)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:93)
    at java.io.InputStreamReader.read(InputStreamReader.java:151)
    at com.nbp.theplatform.threaddump.ThreadSocketReadState$1.run(ThreadSocketReadState.java:27)
    at java.lang.Thread.run(Thread.java:662)

When Waiting

The thread stays in the WAITING state. In the thread dump, the IoWaitThread thread keeps waiting to receive a message from a blocking queue (a LinkedBlockingDeque in this trace). As long as no message arrives on the queue, the thread state will not change.

Figure 6: Waiting status.

"IoWaitThread" prio=6 tid=0x0000000007334800 nid=0x2b3c waiting on condition [0x000000000893f000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007d5c45850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:440)
    at java.util.concurrent.LinkedBlockingDeque.take(LinkedBlockingDeque.java:629)
    at com.nbp.theplatform.threaddump.ThreadIoWaitState$IoWaitHandler2.run(ThreadIoWaitState.java:89)
    at java.lang.Thread.run(Thread.java:662)
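
The same situation is easy to reproduce. The following is a minimal sketch under assumptions of my own: the class `WaitingStateDemo` and the thread name are hypothetical, and it uses a `LinkedBlockingQueue` rather than the deque in the trace. A consumer thread calls `take()` on an empty queue, and the caller polls its state, which is expected to become WAITING until a message is offered.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: the names are illustrative, not taken from the article.
final class WaitingStateDemo {

    /** Starts a consumer on an empty queue and returns the state it is observed in. */
    static Thread.State observeIdleConsumer() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // parks until a message arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "IoWaitDemo-consumer");
        consumer.setDaemon(true);
        consumer.start();

        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            // Poll until the consumer has parked, or give up after 5 seconds.
            while (consumer.getState() != Thread.State.WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State observed = consumer.getState();
        queue.offer("wake up"); // hand over a message so the consumer can finish
        return observed;
    }
}
```

A thread in this state is idle by design. It is only a problem if messages were expected and never arrive.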

When Thread Resources Cannot be Organized Normally

Unnecessary threads pile up when thread resources are not managed properly. If this occurs, monitor the code that creates and cleans up threads, or check the conditions under which threads are supposed to terminate.

Figure 7: Unorganized Threads.

How to Solve Problems by Using Thread Dump

Example 1: When the CPU Usage is Abnormally High

  1. Extract the thread that has the highest CPU usage.

[user@linux ~]$ ps -mo pid,lwp,stime,time,pcpu -C java
      PID         LWP          STIME          TIME               %CPU
     10029          -         Dec07          00:02:02           99.5
         -       10039        Dec07          00:00:00           95.5
         -       10040        Dec07          00:00:00            0.1

From the application, find out which thread is using the CPU the most.

Take the Light Weight Process (LWP) that uses the most CPU and convert its unique number (10039) into a hexadecimal number (0x2737).
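
The conversion can be done with any calculator. As a small sketch in Java (the class `NidConverter` is a hypothetical helper, not part of any tool mentioned here):

```java
// Hypothetical helper for matching `ps` LWP numbers against nid values in a thread dump.
final class NidConverter {

    /** Converts a decimal LWP id from ps into the nid notation used in thread dumps. */
    static String toNid(long lwp) {
        return "0x" + Long.toHexString(lwp);
    }

    /** Converts an nid such as "0x2737" back into the decimal LWP id. */
    static long toLwp(String nid) {
        String hex = nid.startsWith("0x") ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(toNid(10039));
    }
}
```

The reverse direction is useful when you start from a suspicious thread in the dump and want to look up its CPU usage in the `ps` output.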

  2. After acquiring the thread dump, check the thread’s action.

Extract the thread dump of the application with a PID of 10029, then find the thread with an nid of 0x2737.

"NioProcessor-2" prio=10 tid=0x0a8d2800 nid=0x2737 runnable [0x49aa5000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0x74c52678> (a sun.nio.ch.Util$1)
    - locked <0x74c52668> (a java.util.Collections$UnmodifiableSet)
    - locked <0x74c501b0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at external.org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:65)
    at external.org.apache.mina.common.AbstractPollingIoProcessor$Worker.run(AbstractPollingIoProcessor.java:708)
    at external.org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Extract thread dumps several times within an hour, and check how the thread states change to determine the problem.

Example 2: When the Processing Performance is Abnormally Slow

After acquiring thread dumps several times, find the list of threads with BLOCKED status.

"DB-Processor-13" daemon prio=5 tid=0x003edf98 nid=0xca waiting for monitor entry [0x000000000825f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at beans.ConnectionPool.getConnection(ConnectionPool.java:102)
    - waiting to lock <0xe0375410> (a beans.ConnectionPool)
    at beans.cus.ServiceCnt.getTodayCount(ServiceCnt.java:111)
    at beans.cus.ServiceCnt.insertCount(ServiceCnt.java:43)

"DB-Processor-14" daemon prio=5 tid=0x003edf98 nid=0xca waiting for monitor entry [0x000000000825f020]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at beans.ConnectionPool.getConnection(ConnectionPool.java:102)
    - waiting to lock <0xe0375410> (a beans.ConnectionPool)
    at beans.cus.ServiceCnt.getTodayCount(ServiceCnt.java:111)
    at beans.cus.ServiceCnt.insertCount(ServiceCnt.java:43)

"DB-Processor-3" daemon prio=5 tid=0x00928248 nid=0x8b waiting for monitor entry [0x000000000825d080]
   java.lang.Thread.State: RUNNABLE
    at oracle.jdbc.driver.OracleConnection.isClosed(OracleConnection.java:570)
    - waiting to lock <0xe03ba2e0> (a oracle.jdbc.driver.OracleConnection)
    at beans.ConnectionPool.getConnection(ConnectionPool.java:112)
    - locked <0xe0386580> (a java.util.Vector)
    - locked <0xe0375410> (a beans.ConnectionPool)
    at beans.cus.Cue_1700c.GetNationList(Cue_1700c.java:66)
    at org.apache.jsp.cue_1700c_jsp._jspService(cue_1700c_jsp.java:12)

If the threads are BLOCKED, extract the threads related to the lock that the threads are trying to obtain.

Through the thread dump, you can confirm that the threads stay BLOCKED because the <0xe0375410> lock could not be obtained. This problem can be solved by analyzing the stack trace of the thread currently holding the lock, which is DB-Processor-3 in the dump above.

There are two reasons why this pattern frequently appears in applications using a DBMS. The first is inadequate configuration. The threads are still working, but they cannot reach their best performance because the settings for DBCP and the like are not adequate. If you extract thread dumps multiple times and compare them, you will often see that some of the threads that were BLOCKED previously are in a different state.

The second reason is an abnormal connection. When the connection to the DBMS stays abnormal, the threads wait until they time out. In this case, even after extracting thread dumps several times and comparing them, you will see that the threads related to the DBMS are still BLOCKED. By adjusting values such as the timeout, you can shorten the time during which the problem persists.
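
Comparing dumps by eye gets tedious with many threads. The sketch below shows one way to automate the comparison described above. It is an illustration only: the class `BlockedAcrossDumps` is hypothetical, and it assumes each dump uses the `"name"` header and `java.lang.Thread.State:` lines shown in this article. It returns the threads that are BLOCKED in every dump, which points toward the second cause (an abnormal connection) rather than the first.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares several thread dumps taken from the same JVM.
final class BlockedAcrossDumps {
    private static final Pattern HEADER = Pattern.compile("^\\s*\"([^\"]+)\"");

    /** Returns the names of all threads reported as BLOCKED in a single dump. */
    static Set<String> blockedThreads(String dump) {
        Set<String> blocked = new LinkedHashSet<>();
        String current = null;
        for (String line : dump.split("\\R")) {
            Matcher header = HEADER.matcher(line);
            if (header.find()) {
                current = header.group(1);
            }
            // The state may sit on the header line itself or on the line after it.
            if (current != null && line.contains("java.lang.Thread.State: BLOCKED")) {
                blocked.add(current);
            }
        }
        return blocked;
    }

    /** Returns the threads that are BLOCKED in each of the given dumps. */
    static Set<String> blockedInEveryDump(List<String> dumps) {
        Set<String> result = null;
        for (String dump : dumps) {
            Set<String> blocked = blockedThreads(dump);
            if (result == null) {
                result = blocked;
            } else {
                result.retainAll(blocked); // keep only threads blocked in this dump as well
            }
        }
        return result == null ? new LinkedHashSet<>() : result;
    }
}
```

Threads are matched by name here, which is one more reason to give threads meaningful, stable names.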

Coding for Easy Thread Dump

Naming Threads

When a thread is created with java.lang.Thread and no name is given, it is named Thread-(Number). When a thread is created by the default thread factory (java.util.concurrent.Executors.defaultThreadFactory()), it is named pool-(Number)-thread-(Number). When analyzing tens to thousands of threads for an application, default names make the analysis very difficult, because it is hard to tell which threads belong to which part of the application.

Therefore, develop the habit of naming threads whenever you create one.

When you create a thread using java.lang.Thread, you can give the thread a custom name through a constructor parameter.

public Thread(Runnable target, String name);
public Thread(ThreadGroup group, String name);
public Thread(ThreadGroup group, Runnable target, String name);
public Thread(ThreadGroup group, Runnable target, String name, long stackSize);

When you create threads through java.util.concurrent.ThreadFactory, you can name them by writing your own ThreadFactory. If you do not need special functionality, you can use MyThreadFactory as shown below:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
 
public class MyThreadFactory implements ThreadFactory {
  private static final ConcurrentHashMap<String, AtomicInteger> POOL_NUMBER =
                                                       new ConcurrentHashMap<String, AtomicInteger>();
  private final ThreadGroup group;
  private final AtomicInteger threadNumber = new AtomicInteger(1);
  private final String namePrefix;
  
  public MyThreadFactory(String threadPoolName) {
       
      if (threadPoolName == null) {
          throw new NullPointerException("threadPoolName");
      }
        
      POOL_NUMBER.putIfAbsent(threadPoolName, new AtomicInteger());
       
      SecurityManager securityManager = System.getSecurityManager();
      group = (securityManager != null) ? securityManager.getThreadGroup() :
                                                    Thread.currentThread().getThreadGroup();
       
      AtomicInteger poolCount = POOL_NUMBER.get(threadPoolName);
 
      if (poolCount == null) {
            namePrefix = threadPoolName + " pool-00-thread-";
      } else {
            namePrefix = threadPoolName + " pool-" + poolCount.getAndIncrement() + "-thread-";
      }
  }
  
  public Thread newThread(Runnable runnable) {
      Thread thread = new Thread(group, runnable, namePrefix + threadNumber.getAndIncrement(), 0);
 
      if (thread.isDaemon()) {
            thread.setDaemon(false);
      }
 
      if (thread.getPriority() != Thread.NORM_PRIORITY) {
            thread.setPriority(Thread.NORM_PRIORITY);
      }
 
      return thread;
  }
}
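
A custom factory is passed to the executor when the pool is created. The sketch below shows the wiring. To stay self-contained it uses a much smaller stand-in factory, `prefixed(...)`, which is hypothetical and names threads `<prefix>-thread-<n>`. MyThreadFactory above would be passed to the executor in the same place.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: the names are illustrative, not taken from the article.
final class NamedPoolDemo {

    /** Simplified stand-in for MyThreadFactory: names threads "<prefix>-thread-<n>". */
    static ThreadFactory prefixed(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> new Thread(runnable, prefix + "-thread-" + counter.getAndIncrement());
    }

    /** Runs one task on a single-thread pool and returns the name of the worker thread. */
    static String workerName(String prefix) {
        ExecutorService pool = Executors.newSingleThreadExecutor(prefixed(prefix));
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task did not complete", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With names like these, every worker in a thread dump shows which pool it belongs to, which is how the BLOCKED_TEST threads earlier in this article can be told apart from other pool threads.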

Obtaining More Detailed Information by Using MBean

You can obtain ThreadInfo objects through the ThreadMXBean. ThreadInfo also exposes information that is difficult to get from a thread dump.

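import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// The class name is illustrative; the original listing showed only the method body.
public class ThreadInfoPrinter {
  public static void main(String[] args) {
    ThreadMXBean mxBean = ManagementFactory.getThreadMXBean();
    long[] threadIds = mxBean.getAllThreadIds();
    ThreadInfo[] threadInfos = mxBean.getThreadInfo(threadIds);

    for (ThreadInfo threadInfo : threadInfos) {
      if (threadInfo == null) continue; // the thread ended after its ID was listed
      System.out.println(threadInfo.getThreadName());
      System.out.println(threadInfo.getBlockedCount());
      System.out.println(threadInfo.getBlockedTime()); // -1 unless contention monitoring is enabled
      System.out.println(threadInfo.getWaitedCount());
      System.out.println(threadInfo.getWaitedTime());  // -1 unless contention monitoring is enabled
    }
  }
}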

Using the methods in ThreadInfo, you can find out how long each thread has spent in the WAITING or BLOCKED state, and from that you can build a list of threads that have been inactive for an abnormally long period of time.
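
As a rough sketch of that idea (the class `InactiveThreadFinder` and the threshold parameter are hypothetical): the blocked and waited times are only collected once thread contention monitoring is switched on, so the method enables it when the JVM supports it. Times accumulate from that moment, which means a first call shortly after startup will report little.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists threads that spent a long time blocked or waiting.
final class InactiveThreadFinder {

    /** Returns names of threads whose blocked plus waited time exceeds the threshold. */
    static List<String> inactiveLongerThan(long thresholdMillis) {
        ThreadMXBean mxBean = ManagementFactory.getThreadMXBean();
        if (mxBean.isThreadContentionMonitoringSupported()
                && !mxBean.isThreadContentionMonitoringEnabled()) {
            mxBean.setThreadContentionMonitoringEnabled(true);
        }
        List<String> names = new ArrayList<>();
        for (ThreadInfo info : mxBean.getThreadInfo(mxBean.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended after its ID was listed
            }
            // Both getters return -1 while contention monitoring is disabled.
            long blocked = Math.max(0, info.getBlockedTime());
            long waited = Math.max(0, info.getWaitedTime());
            if (blocked + waited > thresholdMillis) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }
}
```

Idle pool workers legitimately wait for long periods, so the result is a list of candidates to inspect, not a list of faults.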

In Conclusion

In writing this article I was concerned that, for developers with a lot of experience in multi-threaded programming, this material may be common knowledge, whereas for less experienced developers I was skipping straight to thread dumps without providing enough background on thread activities. That is due to my own lack of knowledge: I was not able to explain thread activities in a clear yet concise manner. I sincerely hope that this article will prove helpful for many developers.

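A First Experience with AngularJS

| Comments

What are we building with AngularJS?
  • The list is displayed on the right.
  • The search box on the left takes a filter term, and the list shows the filtered results.
  • The "Sort by" box on the left selects the sort key, and the list shows the sorted results.
  • "Reverse Search" shows the value of the search input box reversed (this can be done by creating a directive or a filter).
  • Clicking a link changes the hash value to switch modules and display a different page (using the ng-view or ng-include directive).
Directives and filters needed
  1. ng-app
  2. ng-repeat
  3. ng-model
  4. ng-view
  5. filter
  6. orderBy
  7. The custom directive ngreverse
Code example
City:<input ng-model="city"/>
City reverse:<span ngreverse="city" style="color:red;"></span>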
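appModule.directive('ngreverse', function() {
      return function(scope, element, attrs) {
          // Re-render whenever the watched model expression changes.
          scope.$watch(attrs.ngreverse, function(value) {
              value = value == undefined ? "" : String(value);
              // Reverse the text character by character.
              element.text(value.split('').reverse().join(''));
          });
      };
  });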
Implementation details
  • $routeProvider
var appModule = angular.module('phonecat', []);
appModule.config(
  ['$routeProvider', function($routeProvider) {
      $routeProvider.when('/', {
          templateUrl: 'phone-list.html',
          controller: PhoneListCtrl
      }).when('/:phoneId/:phoneAge', {
          templateUrl: 'phone-detail.html',
          controller: PhoneDetailCtrl
      }).otherwise({redirectTo: '/'});
  }]);
Result

images

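maven-release-plugin in Practice

| Comments

Introduction to maven-release-plugin

This plugin is provided by the Maven project for releasing projects. For example, when we manage source code in SVN, the repository is generally split into three directories: trunk, branches and tags. trunk holds mainline development and normally carries a SNAPSHOT version. A branch may be a small bug-fix version copied from trunk, or a version to be fixed copied from tags, and it also carries a SNAPSHOT version. Only the projects under tags have release version numbers. If you are not clear about the difference between release and snapshot in Maven, see: http://www.mzone.cc/article/277.html

Official site: http://maven.apache.org/maven-release/maven-release-plugin/

How to make good use of maven-release-plugin

In day-to-day development, to make bug fixing easier and to prepare packages for testing, the standard process is to tag the code under development and then publish the build. That means I regularly have to go from 1.0-SNAPSHOT to 1.0 to 1.1-SNAPSHOT. With only a few projects, changing the versions by hand is acceptable. A more complex system, however, is split into many services and business modules, anywhere from 7 or 8 up to 20 or more. The system I work on contains more than 20 projects. Before we adopted the release plugin, whoever was in charge of releases had a hard time, and even tried using shell scripts to change the version numbers in the POMs. That was still rather cumbersome.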

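From 1.0-SNAPSHOT to 1.0 to 1.1-SNAPSHOT

SNAPSHOT means a snapshot. Once the project reaches a milestone, a formal version (a release version) has to be published. A formal release involves the following work:

  1. In trunk, update the POM version from 1.0-SNAPSHOT to 1.0
  2. Create an svn tag for 1.0
  3. Run mvn deploy against the tag to publish the release version
  4. Update trunk from 1.0 to 1.1-SNAPSHOT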
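
The version arithmetic in these steps can be sketched in Java. This is an illustration only: the class `ReleaseVersions` is hypothetical and it is not how maven-release-plugin is implemented. The plugin computes its own suggestions, and real version schemes (qualifiers, build numbers) need more care than this.

```java
// Hypothetical helper: mirrors the simple 1.0-SNAPSHOT -> 1.0 -> 1.1-SNAPSHOT progression.
final class ReleaseVersions {
    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    /** "1.0-SNAPSHOT" becomes "1.0". */
    static String releaseVersion(String snapshotVersion) {
        if (!snapshotVersion.endsWith(SNAPSHOT_SUFFIX)) {
            throw new IllegalArgumentException("not a SNAPSHOT version: " + snapshotVersion);
        }
        return snapshotVersion.substring(0, snapshotVersion.length() - SNAPSHOT_SUFFIX.length());
    }

    /** "1.0" becomes "1.1-SNAPSHOT": the last numeric component is incremented. */
    static String nextDevelopmentVersion(String releaseVersion) {
        int lastDot = releaseVersion.lastIndexOf('.');
        String head = releaseVersion.substring(0, lastDot + 1); // empty when there is no dot
        int last = Integer.parseInt(releaseVersion.substring(lastDot + 1));
        return head + (last + 1) + SNAPSHOT_SUFFIX;
    }

    public static void main(String[] args) {
        String release = releaseVersion("1.0-SNAPSHOT");
        System.out.println(release + " -> " + nextDevelopmentVersion(release));
    }
}
```

Doing this by hand across 20 or more POMs is exactly the chore the plugin removes.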

SCM

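First we need to add the scm information to the POM so that Maven can perform the svn operations for you. A sample configuration:

<scm>
  <connection>scm:svn:http://svn-address-prefix/myapp/trunk/</connection>
  <developerConnection>scm:svn:http://svn-address-prefix/myapp/trunk/</developerConnection>
  <url>http://svn-address-prefix/myapp/trunk/</url>
</scm>

This configuration belongs to the POM under trunk. Each branch and tag needs its own configuration pointing at its own svn address.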

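maven-release-plugin configuration

Next, we need to configure maven-release-plugin. This plugin bumps the POM version, commits, creates the tag, then bumps the version again and commits again. The configuration: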

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-release-plugin</artifactId>
  <version>2.4.1</version>
  <configuration>
      <!-- mvn release:branch -DbranchName=xxx -DupdateBranchVersions=true
                      -DupdateWorkingCopyVersion=false -->
      <branchBase>http://svn-address-prefix/myapp/branches</branchBase>
      <arguments>
          -Dmaven.test.skip=true
      </arguments>

      <!-- mvn release:perform -DautoVersionSubmodules=true -DupdateWorkingCopyVersion=false -->
      <tagBase>http://svn-address-prefix/myapp/tags</tagBase>
      <waitBeforeTagging>10</waitBeforeTagging>
      <username>${svn.username}</username>
      <password>${svn.password}</password>
      <mavenHome>${svn.maven.home}</mavenHome>
      <scmCommentPrefix>
issue:maven-release-plugin
msg:execute maven-release-plugin
      </scmCommentPrefix>
      <autoVersionSubmodules>true</autoVersionSubmodules>
  </configuration>
</plugin>

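As the configuration above shows, you need to configure the user name and password used for svn commits, the Maven home directory (mavenHome), and optionally the prefix for svn commit comments. In addition, for the release plugin to package and deploy to a private remote repository, you need the following configuration: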

<distributionManagement>
  <repository>
      <id>nexus</id>
      <name>Nexus</name>
      <url>http://ci-repository-domain/nexus/content/repositories/releases</url>
  </repository>
  <snapshotRepository>
      <id>nexus</id>
      <name>Nexus</name>
      <url>http://ci-repository-domain/nexus/content/repositories/snapshots</url>
  </snapshotRepository>
</distributionManagement>

Running the Release

mvn release:prepare

During the run you will see prompts like this:

What is the release version for “Unnamed - org.myorg:myapp:jar:1.0-SNAPSHOT”? (org.myorg:myapp) 1.0: :

In other words: "Which version do you want to release 1.0-SNAPSHOT as? The default is 1.0." 1.0 is what I want, so I just press Enter.

What is SCM release tag or label for “Unnamed - org.myorg:myapp:jar:1.0-SNAPSHOT”? (org.myorg:myapp) myapp-1.0: :

In other words: "What is the name of the release tag? The default is myapp-1.0." I keep the default again and press Enter.

What is the new development version for “Unnamed - org.myorg:myapp:jar:1.0-SNAPSHOT”? (org.myorg:myapp) 1.1-SNAPSHOT: :

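In other words: "What is the new version on trunk? The default is 1.1-SNAPSHOT." The release plugin bumps the version to 1.1-SNAPSHOT for me automatically. Good, press Enter.

The screen then scrolls for a while as Maven builds the project and performs some svn operations. You can read through the log for details.

So what is the result? Browse the svn repository:

There is a new tag: https://svn-address.com/myapp/tags/myapp-1.0/ This is version 1.0, the one to be released. Look at the POM in trunk as well: its version has been bumped to 1.1-SNAPSHOT automatically.

Isn't that what we wanted? Wait, something seems to be missing. Right: 1.0 has not been deployed to the repository yet.

Hold your breath once more and run: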

mvn release:perform

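maven-release-plugin checks out the tag we just created, builds it, and deploys it to the remote Maven repository. At this point the version bump, the tagging and the release are all done, and the released version 1.0 is visible in the remote Maven repository.

This is an automated, formal release!

Notes

svn client version: with 1.6 you have to confirm the version number manually during the run; with 1.7 you do not.

The Maven project must not depend on SNAPSHOT versions of other jars. (Projects developed at the same time can be included as modules.)

The commands above support branching from trunk, tagging from trunk, branching from tags, and tagging from branches. release:perform generally requires release:prepare to be run first.

To change the version number on trunk or on a branch, use the release:update-versions goal.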

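Using WireMock in a Maven Environment

| Comments

What is WireMock

WireMock is a flexible library for stubbing and mocking web services. Unlike common mocking tools, WireMock starts a real HTTP server, so the code under test connects to it just as it would to a real web service.

It supports HTTP response stubbing, request verification, proxy/intercept, record/playback of stubs, and fault injection. It can be used from within a unit test or deployed into a test environment.

Although it is written in Java, it also has a JSON API, so it can be used with other languages.

Official site: http://wiremock.org/

What problem does it solve

Driven by the growth of front-end development, many web systems separate the front end from the back end: the front end only needs to call RESTful service APIs to get the data it works with. Once the API contract is agreed, the web team and the API service team can develop in parallel. Each team handles the part it is good at, which makes the work more efficient.

Maven support

1. Split the front-end code into a standalone web Maven project.

The files have to be laid out the way WireMock expects: the project must contain the __files and mappings directories. When WireMock runs standalone from files, __files serves as the document root for static files, while mappings holds the JSON files that define how request URLs map to JSON responses.

The directory layout is shown in the figure below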

images

pom.xml configuration snippet

images

2. Integrate the whole application into one web Maven project.

images

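Installing an Octopress Blog on GitHub

| Comments

With nothing to do over the weekend, I learned how to use GitHub Pages to set up my own blog. A fairly mature blogging system for this is Octopress; see the official documentation for a detailed introduction. This installation is based on 32-bit Windows XP. Reference: http://jinlong.github.io/blog/2013/03/15/deploy-github-pages-using-octopress-on-windows/

  1. Have a GitHub account. If you do not have one, sign up at https://github.com.
  2. Install Git on the Windows machine. After installation, the install directory provides a command-line tool called Git Bash.
  3. Install the Ruby environment. rvm cannot be installed on Windows, and installing the alternative, Pik, also depends on rubygems. So in the end I chose the RubyInstaller package, a one-click install. After installation, add it to Path under Environment Variables -> System variables:

*.;C:\Ruby193\bin;C:\Program Files\Git\bin;C:\Program Files\Git\cmd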

Install DevKit, a development toolkit for Ruby. Then add support for Chinese UTF-8 encoding by setting the following environment variables on Windows 7:

LANG=zh_CN.UTF-8
LC_ALL=zh_CN.UTF-8

Change the gem source as follows:

gem sources --remove http://rubygems.org/
gem sources -a http://ruby.taobao.org/
gem sources -l

Open Git Bash and run the following command to install bundler

gem install bundler

Install Octopress: download the Octopress source code

git clone git://github.com/imathis/octopress.git octopress
cd octopress # If you use RVM, You'll be asked if you trust the .rvmrc file (say yes).
ruby --version # Should report Ruby 1.9.2

Install the dependencies

cd octopress
vi Gemfile
# change the line: source "http://rubygems.org/"
# to:              source "http://ruby.taobao.org/"
bundle install

Install the default theme

rake install

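Publish to GitHub

rake setup_github_pages # you will be prompted for the GitHub repository URL, for example https://github.com/yangtao309/yangtao309.github.com.git

Generate the blog and preview it

rake generate # generate the files
rake preview  # preview the site; the default address is http://127.0.0.1:4000

Finally, push the code to GitHub

rake deploy # you will be prompted for your GitHub account and password

That completes a basic Octopress setup. What remains is writing blog posts, installing the bshare sharing plugin and a Weibo sidebar, and switching to a new theme.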