Category Archives: JAVA

Understanding transaction pitfalls

The most common reason for using transactions in an application is to maintain a high degree of data integrity and consistency. If you’re unconcerned about the quality of your data, you needn’t concern yourself with transactions. After all, transaction support in the Java platform can kill performance, introduce locking issues and database concurrency problems, and add complexity to your application.

But developers who don’t concern themselves with transactions do so at their own peril. Almost all business-related applications require a high degree of data quality. The financial investment industry alone wastes tens of billions of dollars on failed trades, with bad data being the second-leading cause. Although lack of transaction support is only one factor leading to bad data (albeit a major one), a safe inference is that billions of dollars are wasted in the financial investment industry alone as a result of nonexistent or poor transaction support.

Ignorance about transaction support is another source of problems. All too often I hear claims like “we don’t need transaction support in our applications because they never fail.” Right. I have witnessed some applications that in fact rarely or never throw exceptions. These applications bank on well-written code, well-written validation routines, and full testing and code coverage support to avoid the performance costs and complexity associated with transaction processing. The problem with this type of thinking is that it takes into account only one characteristic of transaction support: atomicity. Atomicity ensures that all updates are treated as a single unit and are either all committed or all rolled back. But rolling back or coordinating updates isn’t the only aspect of transaction support. Another aspect, isolation, ensures that one unit of work is isolated from other units of work. Without proper transaction isolation, other units of work can access updates made by an ongoing unit of work, even though that unit of work is incomplete. As a result, business decisions might be made on the basis of partial data, which could cause failed trades or other negative (or costly) outcomes.

So, given the high cost and negative impact of bad data and the basic knowledge that transactions are important (and necessary), you need to use transactions and learn how to deal with the issues that can arise. You press on and add transaction support to your applications. And that’s where the problem often begins. Transactions don’t always seem to work as promised in the Java platform. This article is an exploration of the reasons why. With the help of code examples, I’ll introduce some of the common transaction pitfalls I continually see and experience in the field, in most cases in production environments.

Although most of this article’s code examples use the Spring Framework (version 2.5), the transaction concepts are the same as for the EJB 3.0 specification. In most cases, it is simply a matter of replacing the Spring Framework @Transactional annotation with the @TransactionAttribute annotation found in the EJB 3.0 specification. Where the two frameworks differ in concept and technique, I have included both Spring Framework and EJB 3.0 source code examples.

Local transaction pitfalls

A good place to start is with the easiest scenario: the use of local transactions, also commonly referred to as database transactions. In the early days of database persistence (for example, JDBC), we commonly delegated transaction processing to the database. After all, isn’t that what the database is supposed to do? Local transactions work fine for logical units of work (LUW) that perform a single insert, update, or delete statement. For example, consider the simple JDBC code in Listing 1, which performs an insert of a stock-trade order to a TRADE table:

Listing 1. Simple database insert using JDBC
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
@Stateless
public class TradingServiceImpl implements TradingService {
   @Resource SessionContext ctx;
   @Resource(mappedName="java:jdbc/tradingDS") DataSource ds;
   public long insertTrade(TradeData trade) throws Exception {
      Connection dbConnection = ds.getConnection();
      try {
         Statement sql = dbConnection.createStatement();
         String stmt =
            "INSERT INTO TRADE (ACCT_ID, SIDE, SYMBOL, SHARES, PRICE, STATE)"
          + "VALUES ("
          + trade.getAcct() + "','"
          + trade.getAction() + "','"
          + trade.getSymbol() + "',"
          + trade.getShares() + ","
          + trade.getPrice() + ",'"
          + trade.getState() + "')";
         sql.executeUpdate(stmt, Statement.RETURN_GENERATED_KEYS);
         ResultSet rs = sql.getGeneratedKeys();
         if (rs.next()) {
            return rs.getBigDecimal(1).longValue();
         } else {
            throw new Exception("Trade Order Insert Failed");
         }
      } finally {
         if (dbConnection != null) dbConnection.close();
      }
   }
}

The JDBC code in Listing 1 includes no transaction logic, yet it persists the trade order in the TRADE table in the database. In this case, the database handles the transaction logic.

This is all well and good for a single database maintenance action in the LUW. But suppose you need to update the account balance at the same time you insert the trade order into the database, as shown in Listing 2:

Listing 2. Performing multiple table updates in the same method
1
2
3
4
5
6
7
8
9
10
public TradeData placeTrade(TradeData trade) throws Exception {
   try {
      insertTrade(trade);
      updateAcct(trade);
      return trade;
   } catch (Exception up) {
      //log the error
      throw up;
   }
}

In this case, the insertTrade() and updateAcct() methods use standard JDBC code without transactions. Once the insertTrade() method ends, the database has persisted (and committed) the trade order. If the updateAcct() method should fail for any reason, the trade order would remain in the TRADE table at the end of the placeTrade() method, resulting in inconsistent data in the database. If the placeTrade() method had used transactions, both of these activities would have been included in a single LUW, and the trade order would have been rolled back if the account update failed.

With the popularity of Java persistence frameworks like Hibernate, TopLink, and the Java Persistence API (JPA) on the rise, we rarely write straight JDBC code anymore. More commonly, we use the newer object-relational mapping (ORM) frameworks to make our lives easier by replacing all of that nasty JDBC code with a few simple method calls. For example, to insert the trade order from the JDBC code example in Listing 1, using the Spring Framework with JPA, you’d map the TradeData object to the TRADE table and replace all of that JDBC code with the JPA code in Listing 3:

Listing 3. Simple insert using JPA
1
2
3
4
5
6
7
8
public class TradingServiceImpl {
    @PersistenceContext(unitName="trading") EntityManager em;
    public long insertTrade(TradeData trade) throws Exception {
       em.persist(trade);
       return trade.getTradeId();
    }
}

Notice that Listing 3 invokes the persist() method on the EntityManager to insert the trade order. Simple, right? Not really. This code will not insert the trade order into the TRADE table as expected, nor will it throw an exception. It will simply return a value of 0 as the key to the trade order without changing the database. This is one of the first major pitfalls of transaction processing: ORM-based frameworks require a transaction in order to trigger the synchronization between the object cache and the database. It is through a transaction commit that the SQL code is generated and the database affected by the desired action (that is, insert, update, delete). Without a transaction there is no trigger for the ORM to generate SQL code and persist the changes, so the method simply ends — no exceptions, no updates. If you are using an ORM-based framework, you must use transactions. You can no longer rely on the database to manage the connections and commit the work.

These simple examples should make it clear that transactions are necessary in order to maintain data integrity and consistency. But they only begin to scratch the surface of the complexity and pitfalls associated with implementing transactions in the Java platform.

Spring Framework @Transactional annotation pitfalls

So, you test the code in Listing 3 and discover that the persist() method didn’t work without a transaction. As a result, you view a few links from a simple Internet search and find that with the Spring Framework, you need to use the @Transactional annotation. So you add the annotation to your code as shown in Listing 4:

Listing 4. Using the @Transactional annotation
1
2
3
4
5
6
7
8
9
public class TradingServiceImpl {
   @PersistenceContext(unitName="trading") EntityManager em;
   @Transactional
   public long insertTrade(TradeData trade) throws Exception {
      em.persist(trade);
      return trade.getTradeId();
   }
}

You retest your code, and you find it still doesn’t work. The problem is that you must tell the Spring Framework that you are using annotations for your transaction management. Unless you are doing full unit testing, this pitfall is sometimes hard to discover. It usually leads to developers simply adding the transaction logic in the Spring configuration files rather than through annotations.

When using the @Transactional annotation in Spring, you must add the following line to your Spring configuration file:

1
<tx:annotation-driven transaction-manager="transactionManager"/>

The transaction-manager property holds a reference to the transaction manager bean defined in the Spring configuration file. This code tells Spring to use the @Transaction annotation when applying the transaction interceptor. Without it, the @Transactional annotation is ignored, resulting in no transaction being used in your code.

Getting the basic @Transactional annotation to work in the code in Listing 4 is only the beginning. Notice that Listing 4 uses the @Transactional annotation without specifying any additional annotation parameters. I’ve found that many developers use the @Transactional annotation without taking the time to understand fully what it does. For example, when using the @Transactional annotation by itself as I do in Listing 4, what is the transaction propagation mode set to? What is the read-only flag set to? What is the transaction isolation level set to? More important, when should the transaction roll back the work? Understanding how this annotation is used is important to ensuring that you have the proper level of transaction support in your application. To answer the questions I’ve just asked: when using the @Transactional annotation by itself without any parameters, the propagation mode is set to REQUIRED, the read-only flag is set to false, the transaction isolation level is set to the database default (usually READ_COMMITTED), and the transaction will not roll back on a checked exception.

@Transactional read-only flag pitfalls

A common pitfall I frequently come across in my travels is the improper use of the read-only flag on the Spring @Transactional annotation. Here is a quick quiz for you: When using standard JDBC code for Java persistence, what does the @Transactional annotation in Listing 5 do when the read-only flag is set to true and the propagation mode set to SUPPORTS?

Listing 5. Using read-only with SUPPORTS propagation mode — JDBC
1
2
3
4
@Transactional(readOnly = true, propagation=Propagation.SUPPORTS)
public long insertTrade(TradeData trade) throws Exception {
   //JDBC Code...
}

When the insertTrade() method in Listing 5 executes, does it:

  • Throw a read-only connection exception
  • Correctly insert the trade order and commit the data
  • Do nothing because the propagation level is set to SUPPORTS

Give up? The correct answer is B. The trade order is correctly inserted into the database, even though the read-only flag is set to true and the transaction propagation set to SUPPORTS. But how can that be? No transaction is started because of the SUPPORTS propagation mode, so the method effectively uses a local (database) transaction. The read-only flag is applied only if a transaction is started. In this case, no transaction was started, so the read-only flag is ignored.

Okay, so if that is the case, what does the @Transactional annotation do in Listing 6 when the read-only flag is set and the propagation mode is set to REQUIRED?

Listing 6. Using read-only with REQUIRED propagation mode — JDBC
1
2
3
4
@Transactional(readOnly = true, propagation=Propagation.REQUIRED)
public long insertTrade(TradeData trade) throws Exception {
   //JDBC code...
}

When executed, does the insertTrade() method in Listing 6:

  • Throw a read-only connection exception
  • Correctly insert the trade order and commit the data
  • Do nothing because the read-only flag is set to true

This one should be easy to answer given the prior explanation. The correct answer here is A. An exception will be thrown, indicating that you are trying to perform an update operation on a read-only connection. Because a transaction is started (REQUIRED), the connection is set to read-only. Sure enough, when you try to execute the SQL statement, you get an exception telling you that the connection is a read-only connection.

The odd thing about the read-only flag is that you need to start a transaction in order to use it. Why would you need a transaction if you are only reading data? The answer is that you don’t. Starting a transaction to perform a read-only operation adds to the overhead of the processing thread and can cause shared read locks on the database (depending on what type of database you are using and what the isolation level is set to). The bottom line is that the read-only flag is somewhat meaningless when you use it for JDBC-based Java persistence and causes additional overhead when an unnecessary transaction is started.

What about when you use an ORM-based framework? In keeping with the quiz format, can you guess what the result of the @Transactional annotation in Listing 7 would be if the insertTrade() method were invoked using JPA with Hibernate?

Listing 7. Using read-only with REQUIRED propagation mode — JPA
1
2
3
4
5
@Transactional(readOnly = true, propagation=Propagation.REQUIRED)
public long insertTrade(TradeData trade) throws Exception {
   em.persist(trade);
   return trade.getTradeId();
}

Does the insertTrade() method in Listing 7:

  • Throw a read-only connection exception
  • Correctly insert the trade order and commit the data
  • Do nothing because the readOnly flag is set to true

The answer to this question is a bit more tricky. In some cases the answer is C, but in most cases (particularly when using JPA) the answer is B. The trade order is correctly inserted into the database without error. Wait a minute — the preceding example shows that a read-only connection exception would be thrown when the REQUIRED propagation mode is used. That is true when you use JDBC. However, when you use an ORM-based framework, the read-only flag works a bit differently. When you are generating a key on an insert, the ORM framework will go to the database to obtain the key and subsequently perform the insert. For some vendors, such as Hibernate, the flush mode will be set to MANUAL, and no insert will occur for inserts with non-generated keys. The same holds true for updates. However, other vendors, like TopLink, will always perform inserts and updates when the read-only flag is set to true. Although this is both vendor and version specific, the point here is that you cannot be guaranteed that the insert or update will not occur when the read-only flag is set, particularly when using JPA as it is vendor-agnostic.

Which brings me to another major pitfall I frequently encounter. Given all you’ve read so far, what do you suppose the code in Listing 8 would do if you only set the read-only flag on the @Transactional annotation?

Listing 8. Using read-only — JPA
1
2
3
4
@Transactional(readOnly = true)
public TradeData getTrade(long tradeId) throws Exception {
   return em.find(TradeData.class, tradeId);
}

Does the getTrade() method in Listing 8:

  • Start a transaction, get the trade order, then commit the transaction
  • Get the trade order without starting a transaction

The correct answer here is A. A transaction is started and committed. Don’t forget: the default propagation mode for the @Transactional annotation is REQUIRED. This means that a transaction is started when in fact one is not required (see Never say never). . Depending on the database you are using, this can cause unnecessary shared locks, resulting in possible deadlock situations in the database. In addition, unnecessary processing time and resources are being consumed starting and stopping the transaction. The bottom line is that when you use an ORM-based framework, the read-only flag is quite useless and in most cases is ignored. But if you still insist on using it, always set the propagation mode to SUPPORTS, as shown in Listing 9, so no transaction is started:

Listing 9. Using read-only and SUPPORTS propagation mode for select operation
1
2
3
4
@Transactional(readOnly = true, propagation=Propagation.SUPPORTS)
public TradeData getTrade(long tradeId) throws Exception {
   return em.find(TradeData.class, tradeId);
}

Better yet, just avoid using the @Transactional annotation altogether when doing read operations, as shown in Listing 10:

Listing 10. Removing the @Transactional annotation for select operations
1
2
3
public TradeData getTrade(long tradeId) throws Exception {
   return em.find(TradeData.class, tradeId);
}

REQUIRES_NEW transaction attribute pitfalls

Whether you’re using the Spring Framework or EJB, use of the REQUIRES_NEW transaction attribute can have negative results and lead to corrupt and inconsistent data. The REQUIRES_NEW transaction attribute always starts a new transaction when the method is started, whether or not an existing transaction is present. Many developers use the REQUIRES_NEW attribute incorrectly, assuming it is the correct way to make sure that a transaction is started. Consider the two methods in Listing 11:

Listing 11. Using the REQUIRES_NEW transaction attribute
1
2
3
4
5
@Transactional(propagation=Propagation.REQUIRES_NEW)
public long insertTrade(TradeData trade) throws Exception {...}
@Transactional(propagation=Propagation.REQUIRES_NEW)
public void updateAcct(TradeData trade) throws Exception {...}

Notice in Listing 11 that both of these methods are public, implying that they can be invoked independently from each other. Problems occur with the REQUIRES_NEW attribute when methods using it are invoked within the same logical unit of work via inter-service communication or through orchestration. For example, suppose in Listing 11 that you can invoke the updateAcct() method independently of any other method in some use cases, but there’s also the case where the updateAcct() method is also invoked in the insertTrade() method. Now, if an exception occurs after the updateAcct() method call, the trade order would be rolled back, but the account updates would be committed to the database, as shown in Listing 12:

Listing 12. Multiple updates using the REQUIRES_NEW transaction attribute
1
2
3
4
5
6
7
@Transactional(propagation=Propagation.REQUIRES_NEW)
public long insertTrade(TradeData trade) throws Exception {
   em.persist(trade);
   updateAcct(trade);
   //exception occurs here! Trade rolled back but account update is not!
   ...
}

This happens because a new transaction is started in the updateAcct() method, so that transaction commits once the updateAcct() method ends. When you use the REQUIRES_NEW transaction attribute, if an existing transaction context is present, the current transaction is suspended and a new transaction started. Once that method ends, the new transaction commits and the original transaction resumes.

Because of this behavior, the REQUIRES_NEW transaction attribute should be used only if the database action in the method being invoked needs to be saved to the database regardless of the outcome of the overlaying transaction. For example, suppose that every stock trade that was attempted had to be recorded in an audit database. This information needs to be persisted whether or not the trade failed because of validation errors, insufficient funds, or some other reason. If you did not use the REQUIRES_NEW attribute on the audit method, the audit record would be rolled back along with the attempted trade. Using the REQUIRES_NEW attribute guarantees that the audit data is saved regardless of the initial transaction’s outcome. The main point here is always to use either the MANDATORY or REQUIRED attribute instead of REQUIRES_NEW unless you have a reason to use it for reasons similar those to the audit example.

Transaction rollback pitfalls

I’ve saved the most common transaction pitfall for last. Unfortunately, I see this one in production code more times than not. I’ll start with the Spring Framework and then move on to EJB 3.

So far, the code you have been looking at looks something like Listing 13:

Listing 13. No rollback support
1
2
3
4
5
6
7
8
9
10
11
@Transactional(propagation=Propagation.REQUIRED)
public TradeData placeTrade(TradeData trade) throws Exception {
   try {
      insertTrade(trade);
      updateAcct(trade);
      return trade;
   } catch (Exception up) {
      //log the error
      throw up;
   }
}

Suppose the account does not have enough funds to purchase the stock in question or is not set up to purchase or sell stock yet and throws a checked exception (for example, FundsNotAvailableException). Does the trade order get persisted in the database or is the entire logical unit of work rolled back? The answer, surprisingly, is that upon a checked exception (either in the Spring Framework or EJB), the transaction commits any work that has not yet been committed. Using Listing 13, this means that if a checked exception occurs during the updateAcct() method, the trade order is persisted, but the account isn’t updated to reflect the trade.

This is perhaps the primary data-integrity and consistency issue when transactions are used. Run-time exceptions (that is, unchecked exceptions) automatically force the entire logical unit of work to roll back, but checked exceptions do not. Therefore, the code in Listing 13 is useless from a transaction standpoint; although it appears that it uses transactions to maintain atomicity and consistency, in fact it does not.

Although this sort of behavior may seem strange, transactions behave this way for some good reasons. First of all, not all checked exceptions are bad; they might be used for event notification or to redirect processing based on certain conditions. But more to the point, the application code may be able to take corrective action on some types of checked exceptions, thereby allowing the transaction to complete. For example, consider the scenario in which you are writing the code for an online book retailer. To complete the book order, you need to send an e-mail confirmation as part of the order process. If the e-mail server is down, you would send some sort of SMTP checked exception indicating that the message cannot be sent. If checked exceptions caused an automatic rollback, the entire book order would be rolled back just because the e-mail server was down. By not automatically rolling back on checked exceptions, you can catch that exception and perform some sort of corrective action (such as sending the message to a pending queue) and commit the rest of the order.

When you use the Declarative transaction model (described in more detail in Part 2 of this series), you must specify how the container or framework should handle checked exceptions. In the Spring Framework you specify this through the rollbackFor parameter in the @Transactional annotation, as shown in Listing 14:

Listing 14. Adding transaction rollback support — Spring
1
2
3
4
5
6
7
8
9
10
11
@Transactional(propagation=Propagation.REQUIRED, rollbackFor=Exception.class)
public TradeData placeTrade(TradeData trade) throws Exception {
   try {
      insertTrade(trade);
      updateAcct(trade);
      return trade;
   } catch (Exception up) {
      //log the error
      throw up;
   }
}

Notice the use of the rollbackFor parameter in the @Transactional annotation. This parameter accepts either a single exception class or an array of exception classes, or you can use the rollbackForClassName parameter to specify the names of the exceptions as Java String types. You can also use the negative version of this property (noRollbackFor) to specify that all exceptions should force a rollback except certain ones. Typically most developers specify Exception.class as the value, indicating that all exceptions in this method should force a rollback.

EJBs work a little bit differently from the Spring Framework with regard to rolling back a transaction. The @TransactionAttribute annotation found in the EJB 3.0 specification does not include directives to specify the rollback behavior. Rather, you must use the SessionContext.setRollbackOnly() method to mark the transaction for rollback, as illustrated in Listing 15:

Listing 15. Adding transaction rollback support — EJB
1
2
3
4
5
6
7
8
9
10
11
12
@TransactionAttribute(TransactionAttributeType.REQUIRED)
public TradeData placeTrade(TradeData trade) throws Exception {
   try {
      insertTrade(trade);
      updateAcct(trade);
      return trade;
   } catch (Exception up) {
      //log the error
      sessionCtx.setRollbackOnly();
      throw up;
   }
}

Once the setRollbackOnly() method is invoked, you cannot change your mind; the only possible outcome is to roll back the transaction upon completion of the method that started the transaction. The transaction strategies described in future articles in the series will provide guidance on when and where to use the rollback directives and on when to use the REQUIRED vs. MANDATORY transaction attributes.

Conclusion

The code used to implement transactions in the Java platform is not overly complex; however, how you use and configure it can get somewhat complex. Many pitfalls are associated with implementing transaction support in the Java platform (including some less common ones that I haven’t discussed here). The biggest issue with most of them is that no compiler warnings or run-time errors tell you that the transaction implementation is incorrect. Furthermore, contrary to the assumption reflected in the “Better late than never” anecdote at the start of this article, implementing transaction support is not only a coding exercise. A significant amount of design effort goes into developing an overall transaction strategy. The rest of the Transaction strategies series will help guide you in terms of how to design an effective transaction strategy for use cases ranging from simple applications to high-performance transaction processing.


Downloadable resources

from:http://www.ibm.com/developerworks/java/library/j-ts1/index.html

refer:Spring @Transactional – isolation, propagationrefer

java面试题

关于Java并发编程的总结和思考

为什么需要并发

并发其实是一种解耦合的策略,它帮助我们把做什么(目标)和什么时候做(时机)分开。这样做可以明显改进应用程序的吞吐量(获得更多的CPU调度时间)和结构(程序有多个部分在协同工作)。做过Java Web开发的人都知道,Java Web中的Servlet程序在Servlet容器的支持下采用单实例多线程的工作模式,Servlet容器为你处理了并发问题。

误解和正解

最常见的对并发编程的误解有以下这些:
-并发总能改进性能(并发在CPU有很多空闲时间时能明显改进程序的性能,但当线程数量较多的时候,线程间频繁的调度切换反而会让系统的性能下降)
-编写并发程序无需修改原有的设计(目的与时机的解耦往往会对系统结构产生巨大的影响)
-在使用Web或EJB容器时不用关注并发问题(只有了解了容器在做什么,才能更好的使用容器)
下面的这些说法才是对并发客观的认识:
编写并发程序会在代码上增加额外的开销
-正确的并发是非常复杂的,即使对于很简单的问题
-并发中的缺陷因为不易重现也不容易被发现
-并发往往需要对设计策略从根本上进行修改

并发编程的原则和技巧

单一职责原则

分离并发相关代码和其他代码(并发相关代码有自己的开发、修改和调优生命周期)。

限制数据作用域

两个线程修改共享对象的同一字段时可能会相互干扰,导致不可预期的行为,解决方案之一是构造临界区,但是必须限制临界区的数量。

使用数据副本

数据副本是避免共享数据的好方法,复制出来的对象只是以只读的方式对待。Java 5的java.util.concurrent包中增加一个名为CopyOnWriteArrayList的类,它是List接口的子类型,所以你可以认为它是ArrayList的线程安全的版本,它使用了写时复制的方式创建数据副本进行操作来避免对共享数据并发访问而引发的问题。

线程应尽可能独立

让线程存在于自己的世界中,不与其他线程共享数据。有过Java Web开发经验的人都知道,Servlet就是以单实例多线程的方式工作,和每个请求相关的数据都是通过Servlet子类的service方法(或者是doGet或doPost方法)的参数传入的。只要Servlet中的代码只使用局部变量,Servlet就不会导致同步问题。springMVC的控制器也是这么做的,从请求中获得的对象都是以方法的参数传入而不是作为类的成员,很明显Struts 2的做法就正好相反,因此Struts 2中作为控制器的Action类都是每个请求对应一个实例。

Java 5以前的并发编程

Java的线程模型建立在抢占式线程调度的基础上,也就是说:

  • 所有线程可以很容易的共享同一进程中的对象。
  • 能够引用这些对象的任何线程都可以修改这些对象。
  • 为了保护数据,对象可以被锁住。

Java基于线程和锁的并发过于底层,而且使用锁很多时候都是很万恶的,因为它相当于让所有的并发都变成了排队等待。
在Java 5以前,可以用synchronized关键字来实现锁的功能,它可以用在代码块和方法上,表示在执行整个代码块或方法之前线程必须取得合适的锁。对于类的非静态方法(成员方法)而言,这意味这要取得对象实例的锁,对于类的静态方法(类方法)而言,要取得类的Class对象的锁,对于同步代码块,程序员可以指定要取得的是那个对象的锁。
不管是同步代码块还是同步方法,每次只有一个线程可以进入,如果其他线程试图进入(不管是同一同步块还是不同的同步块),JVM会将它们挂起(放入到等锁池中)。这种结构在并发理论中称为临界区(critical section)。这里我们可以对Java中用synchronized实现同步和锁的功能做一个总结:

  • 只能锁定对象,不能锁定基本数据类型
  • 被锁定的对象数组中的单个对象不会被锁定
  • 同步方法可以视为包含整个方法的synchronized(this) { … }代码块
  • 静态同步方法会锁定它的Class对象
  • 内部类的同步是独立于外部类的
  • synchronized修饰符并不是方法签名的组成部分,所以不能出现在接口的方法声明中
  • 非同步的方法不关心锁的状态,它们在同步方法运行时仍然可以得以运行
  • synchronized实现的锁是可重入的锁。

在JVM内部,为了提高效率,同时运行的每个线程都会有它正在处理的数据的缓存副本,当我们使用synchronzied进行同步的时候,真正被同步的是在不同线程中表示被锁定对象的内存块(副本数据会保持和主内存的同步,现在知道为什么要用同步这个词汇了吧),简单的说就是在同步块或同步方法执行完后,对被锁定的对象做的任何修改要在释放锁之前写回到主内存中;在进入同步块得到锁之后,被锁定对象的数据是从主内存中读出来的,持有锁的线程的数据副本一定和主内存中的数据视图是同步的 。
在Java最初的版本中,就有一个叫volatile的关键字,它是一种简单的同步的处理机制,因为被volatile修饰的变量遵循以下规则:

  • 变量的值在使用之前总会从主内存中再读取出来。
  • 对变量值的修改总会在完成之后写回到主内存中。

使用volatile关键字可以在多线程环境下预防编译器不正确的优化假设(编译器可能会将在一个线程中值不会发生改变的变量优化成常量),但只有修改时不依赖当前状态(读取时的值)的变量才应该声明为volatile变量。
不变模式也是并发编程时可以考虑的一种设计。让对象的状态是不变的,如果希望修改对象的状态,就会创建对象的副本并将改变写入副本而不改变原来的对象,这样就不会出现状态不一致的情况,因此不变对象是线程安全的。Java中我们使用频率极高的String类就采用了这样的设计。如果对不变模式不熟悉,可以阅读阎宏博士的《Java与模式》一书的第34章。说到这里你可能也体会到final关键字的重要意义了。

Java 5的并发编程

不管今后的Java向着何种方向发展或者灭亡,Java 5绝对是Java发展史中一个极其重要的版本,这个版本提供的各种语言特性我们不在这里讨论(有兴趣的可以阅读我的另一篇文章《Java的第20年:从Java版本演进看编程技术的发展》),但是我们必须要感谢Doug Lea在Java 5中提供了他里程碑式的杰作java.util.concurrent包,它的出现让Java的并发编程有了更多的选择和更好的工作方式。Doug Lea的杰作主要包括以下内容:

  • 更好的线程安全的容器
  • 线程池和相关的工具类
  • 可选的非阻塞解决方案
  • 显示的锁和信号量机制

下面我们对这些东西进行一一解读。

原子类

Java 5中的java.util.concurrent包下面有一个atomic子包,其中有几个以Atomic打头的类,例如AtomicInteger和AtomicLong。它们利用了现代处理器的特性,可以用非阻塞的方式完成原子操作,代码如下所示:

/**
 ID序列生成器
*/
public class IdGenerator {
    private final AtomicLong sequenceNumber = new AtomicLong(0);

    public long next() {
        return sequenceNumber.getAndIncrement(); 
    }
}

显示锁

基于synchronized关键字的锁机制有以下问题:

  • 锁只有一种类型,而且对所有同步操作都是一样的作用
  • 锁只能在代码块或方法开始的地方获得,在结束的地方释放
  • 线程要么得到锁,要么阻塞,没有其他的可能性

Java 5对锁机制进行了 重构,提供了显示的锁,这样可以在以下几个方面提升锁机制:

  • 可以添加不同类型的锁,例如读取锁和写入锁
  • 可以在一个方法中加锁,在另一个方法中解锁
  • 可以使用tryLock方式尝试获得锁,如果得不到锁可以等待、回退或者干点别的事情,当然也可以在超时之后放弃操作

显示的锁都实现了java.util.concurrent.Lock接口,主要有两个实现类:

  • ReentrantLock – 比synchronized稍微灵活一些的重入锁
  • ReentrantReadWriteLock – 在读操作很多写操作很少时性能更好的一种重入锁

对于如何使用显示锁,可以参考我的Java面试系列文章 《Java面试题集51-70》中第60题的代码。只有一点需要提醒,解锁的方法unlock的调用最好能够在finally块中,因为这里是释放外部资源最好的地方,当然也是释放锁的最佳位置,因为不管正常异常可能都要释放掉锁来给其他线程以运行的机会。

CountDownLatch

CountDownLatch是一种简单的同步模式,它让一个线程可以等待一个或多个线程完成它们的工作从而避免对临界资源并发访问所引发的各种问题。下面借用别人的一段代码(我对它做了一些重构)来演示CountDownLatch是如何工作的。

import java.util.concurrent.CountDownLatch;

/**
 * 工人类
 * @author 骆昊
 *
 */
class Worker {
    private String name;        // 名字
    private long workDuration;  // 工作持续时间

    /**
     * 构造器
     */
    public Worker(String name, long workDuration) {
        this.name = name;
        this.workDuration = workDuration;
    }

    /**
     * 完成工作
     */
    public void doWork() {
        System.out.println(name + " begins to work...");
        try {
            Thread.sleep(workDuration); // 用休眠模拟工作执行的时间
        } catch(InterruptedException ex) {
            ex.printStackTrace();
        }
        System.out.println(name + " has finished the job...");
    }
}

/**
 * 测试线程
 * @author 骆昊
 *
 */
class WorkerTestThread implements Runnable {
    private Worker worker;
    private CountDownLatch cdLatch;

    public WorkerTestThread(Worker worker, CountDownLatch cdLatch) {
        this.worker = worker;
        this.cdLatch = cdLatch;
    }

    @Override
    public void run() {
        worker.doWork();        // 让工人开始工作
        cdLatch.countDown();    // 工作完成后倒计时次数减1
    }
}

class CountDownLatchTest {

    private static final int MAX_WORK_DURATION = 5000;  // 最大工作时间
    private static final int MIN_WORK_DURATION = 1000;  // 最小工作时间

    // 产生随机的工作时间
    private static long getRandomWorkDuration(long min, long max) {
        return (long) (Math.random() * (max - min) + min);
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(2);   // 创建倒计时闩并指定倒计时次数为2
        Worker w1 = new Worker("骆昊", getRandomWorkDuration(MIN_WORK_DURATION, MAX_WORK_DURATION));
        Worker w2 = new Worker("王大锤", getRandomWorkDuration(MIN_WORK_DURATION, MAX_WORK_DURATION));

        new Thread(new WorkerTestThread(w1, latch)).start();
        new Thread(new WorkerTestThread(w2, latch)).start();

        try {
            latch.await();  // 等待倒计时闩减到0
            System.out.println("All jobs have been finished!");
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

ConcurrentHashMap

ConcurrentHashMap是HashMap在并发环境下的版本,大家可能要问,既然已经可以通过Collections.synchronizedMap获得线程安全的映射型容器,为什么还需要ConcurrentHashMap呢?因为通过Collections工具类获得的线程安全的HashMap会在读写数据时对整个容器对象上锁,这样其他使用该容器的线程无论如何也无法再获得该对象的锁,也就意味着要一直等待前一个获得锁的线程离开同步代码块之后才有机会执行。实际上,HashMap是通过哈希函数来确定存放键值对的桶(桶是为了解决哈希冲突而引入的),修改HashMap时并不需要将整个容器锁住,只需要锁住即将修改的“桶”就可以了。HashMap的数据结构如下图所示。
这里写图片描述

此外,ConcurrentHashMap还提供了原子操作的方法,如下所示:

  • putIfAbsent:如果还没有对应的键值对映射,就将其添加到HashMap中。
  • remove:如果键存在而且值与当前状态相等(equals比较结果为true),则用原子方式移除该键值对映射
  • replace:替换掉映射中元素的原子操作

CopyOnWriteArrayList

CopyOnWriteArrayList是ArrayList在并发环境下的替代品。CopyOnWriteArrayList通过增加写时复制语义来避免并发访问引起的问题,也就是说任何修改操作都会在底层创建一个列表的副本,也就意味着之前已有的迭代器不会碰到意料之外的修改。这种方式对于不要严格读写同步的场景非常有用,因为它提供了更好的性能。记住,要尽量减少锁的使用,因为那势必带来性能的下降(对数据库中数据的并发访问不也是如此吗?如果可以的话就应该放弃悲观锁而使用乐观锁),CopyOnWriteArrayList很明显也是通过牺牲空间获得了时间(在计算机的世界里,时间和空间通常是不可调和的矛盾,可以牺牲空间来提升效率获得时间,当然也可以通过牺牲时间来减少对空间的使用)。
这里写图片描述

可以通过下面两段代码的运行状况来验证一下CopyOnWriteArrayList是不是线程安全的容器。

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AddThread implements Runnable {
    private List<Double> list;

    public AddThread(List<Double> list) {
        this.list = list;
    }

    @Override
    public void run() {
        for(int i = 0; i < 10000; ++i) {
            list.add(Math.random());
        }
    }
}

public class Test05 {
    private static final int THREAD_POOL_SIZE = 2;

    public static void main(String[] args) {
        List<Double> list = new ArrayList<>();
        ExecutorService es = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
        es.execute(new AddThread(list));
        es.execute(new AddThread(list));
        es.shutdown();
    }
}

上面的代码会在运行时产生ArrayIndexOutOfBoundsException,试一试将上面代码25行的ArrayList换成CopyOnWriteArrayList再重新运行。

List<Double> list = new CopyOnWriteArrayList<>();

Queue

队列是一个无处不在的美妙概念,它提供了一种简单又可靠的方式将资源分发给处理单元(也可以说是将工作单元分配给待处理的资源,这取决于你看待问题的方式)。实现中的并发编程模型很多都依赖队列来实现,因为它可以在线程之间传递工作单元。
Java 5中的BlockingQueue就是一个在并发环境下非常好用的工具,在调用put方法向队列中插入元素时,如果队列已满,它会让插入元素的线程等待队列腾出空间;在调用take方法从队列中取元素时,如果队列为空,取出元素的线程就会阻塞。
这里写图片描述
可以用BlockingQueue来实现生产者-消费者并发模型(下一节中有介绍),当然在Java 5以前也可以通过wait和notify来实现线程调度,比较一下两种代码就知道基于已有的并发工具类来重构并发代码到底好在哪里了。

  • 基于wait和notify的实现
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * 公共常量
 * @author 骆昊
 *
 */
class Constants {
    public static final int MAX_BUFFER_SIZE = 10;
    public static final int NUM_OF_PRODUCER = 2;
    public static final int NUM_OF_CONSUMER = 3;
}

/**
 * 工作任务
 * @author 骆昊
 *
 */
class Task {
    private String id;  // 任务的编号

    public Task() {
        id = UUID.randomUUID().toString();
    }

    @Override
    public String toString() {
        return "Task[" + id + "]";
    }
}

/**
 * 消费者
 * @author 骆昊
 *
 */
class Consumer implements Runnable {
    private List<Task> buffer;

    public Consumer(List<Task> buffer) {
        this.buffer = buffer;
    }

    @Override
    public void run() {
        while(true) {
            synchronized(buffer) {
                while(buffer.isEmpty()) {
                    try {
                        buffer.wait();
                    } catch(InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                Task task = buffer.remove(0);
                buffer.notifyAll();
                System.out.println("Consumer[" + Thread.currentThread().getName() + "] got " + task);
            }
        }
    }
}

/**
 * 生产者
 * @author 骆昊
 *
 */
class Producer implements Runnable {
    private List<Task> buffer;

    public Producer(List<Task> buffer) {
        this.buffer = buffer;
    }

    @Override
    public void run() {
        while(true) {
            synchronized (buffer) {
                while(buffer.size() >= Constants.MAX_BUFFER_SIZE) {
                    try {
                        buffer.wait();
                    } catch(InterruptedException e) {
                        e.printStackTrace();
                    }
                }
                Task task = new Task();
                buffer.add(task);
                buffer.notifyAll();
                System.out.println("Producer[" + Thread.currentThread().getName() + "] put " + task);
            }
        }
    }

}

public class Test06 {

    public static void main(String[] args) {
        List<Task> buffer = new ArrayList<>(Constants.MAX_BUFFER_SIZE);
        ExecutorService es = Executors.newFixedThreadPool(Constants.NUM_OF_CONSUMER + Constants.NUM_OF_PRODUCER);
        for(int i = 1; i <= Constants.NUM_OF_PRODUCER; ++i) {
            es.execute(new Producer(buffer));
        }
        for(int i = 1; i <= Constants.NUM_OF_CONSUMER; ++i) {
            es.execute(new Consumer(buffer));
        }
    }
}
  • 基于BlockingQueue的实现
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * 公共常量
 * @author 骆昊
 *
 */
class Constants {
    public static final int MAX_BUFFER_SIZE = 10;
    public static final int NUM_OF_PRODUCER = 2;
    public static final int NUM_OF_CONSUMER = 3;
}

/**
 * 工作任务
 * @author 骆昊
 *
 */
class Task {
    private String id;  // 任务的编号

    public Task() {
        id = UUID.randomUUID().toString();
    }

    @Override
    public String toString() {
        return "Task[" + id + "]";
    }
}

/**
 * 消费者
 * @author 骆昊
 *
 */
class Consumer implements Runnable {
    private BlockingQueue<Task> buffer;

    public Consumer(BlockingQueue<Task> buffer) {
        this.buffer = buffer;
    }

    @Override
    public void run() {
        while(true) {
            try {
                Task task = buffer.take();
                System.out.println("Consumer[" + Thread.currentThread().getName() + "] got " + task);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

/**
 * 生产者
 * @author 骆昊
 *
 */
class Producer implements Runnable {
    private BlockingQueue<Task> buffer;

    public Producer(BlockingQueue<Task> buffer) {
        this.buffer = buffer;
    }

    @Override
    public void run() {
        while(true) {
            try {
                Task task = new Task();
                buffer.put(task);
                System.out.println("Producer[" + Thread.currentThread().getName() + "] put " + task);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }

        }
    }

}

public class Test07 {

    public static void main(String[] args) {
        BlockingQueue<Task> buffer = new LinkedBlockingQueue<>(Constants.MAX_BUFFER_SIZE);
        ExecutorService es = Executors.newFixedThreadPool(Constants.NUM_OF_CONSUMER + Constants.NUM_OF_PRODUCER);
        for(int i = 1; i <= Constants.NUM_OF_PRODUCER; ++i) {
            es.execute(new Producer(buffer));
        }
        for(int i = 1; i <= Constants.NUM_OF_CONSUMER; ++i) {
            es.execute(new Consumer(buffer));
        }
    }
}

使用BlockingQueue后代码优雅了很多。

并发模型

在继续下面的探讨之前,我们还是重温一下几个概念:

概念 解释
临界资源 并发环境中有着固定数量的资源
互斥 对资源的访问是排他式的
饥饿 一个或一组线程长时间或永远无法取得进展
死锁 两个或多个线程相互等待对方结束
活锁 想要执行的线程总是发现其他的线程正在执行以至于长时间或永远无法执行

重温了这几个概念后,我们可以探讨一下下面的几种并发模型。

生产者-消费者

一个或多个生产者创建某些工作并将其置于缓冲区或队列中,一个或多个消费者会从队列中获得这些工作并完成之。这里的缓冲区或队列是临界资源。当缓冲区或队列放满的时候,生产这会被阻塞;而缓冲区或队列为空的时候,消费者会被阻塞。生产者和消费者的调度是通过二者相互交换信号完成的。

读者-写者

当存在一个主要为读者提供信息的共享资源,它偶尔会被写者更新,但是需要考虑系统的吞吐量,又要防止饥饿和陈旧资源得不到更新的问题。在这种并发模型中,如何平衡读者和写者是最困难的,当然这个问题至今还是一个被热议的问题,恐怕必须根据具体的场景来提供合适的解决方案而没有那种放之四海而皆准的方法(不像我在国内的科研文献中看到的那样)。

哲学家进餐

1965年,荷兰计算机科学家图灵奖得主Edsger Wybe Dijkstra提出并解决了一个他称之为哲学家进餐的同步问题。这个问题可以简单地描述如下:五个哲学家围坐在一张圆桌周围,每个哲学家面前都有一盘通心粉。由于通心粉很滑,所以需要两把叉子才能夹住。相邻两个盘子之间放有一把叉子如下图所示。哲学家的生活中有两种交替活动时段:即吃饭和思考。当一个哲学家觉得饿了时,他就试图分两次去取其左边和右边的叉子,每次拿一把,但不分次序。如果成功地得到了两把叉子,就开始吃饭,吃完后放下叉子继续思考。
把上面问题中的哲学家换成线程,把叉子换成竞争的临界资源,上面的问题就是线程竞争资源的问题。如果没有经过精心的设计,系统就会出现死锁、活锁、吞吐量下降等问题。
这里写图片描述
下面是用信号量原语来解决哲学家进餐问题的代码,使用了Java 5并发工具包中的Semaphore类(代码不够漂亮但是已经足以说明问题了)。

//import java.util.concurrent.ExecutorService;
//import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/**
 * 存放线程共享信号量的上下问
 * @author 骆昊
 *
 */
class AppContext {
    public static final int NUM_OF_FORKS = 5;   // 叉子数量(资源)
    public static final int NUM_OF_PHILO = 5;   // 哲学家数量(线程)

    public static Semaphore[] forks;    // 叉子的信号量
    public static Semaphore counter;    // 哲学家的信号量

    static {
        forks = new Semaphore[NUM_OF_FORKS];

        for (int i = 0, len = forks.length; i < len; ++i) {
            forks[i] = new Semaphore(1);    // 每个叉子的信号量为1
        }

        counter = new Semaphore(NUM_OF_PHILO - 1);  // 如果有N个哲学家,最多只允许N-1人同时取叉子
    }

    /**
     * 取得叉子
     * @param index 第几个哲学家
     * @param leftFirst 是否先取得左边的叉子
     * @throws InterruptedException
     */
    public static void putOnFork(int index, boolean leftFirst) throws InterruptedException {
        if(leftFirst) {
            forks[index].acquire();
            forks[(index + 1) % NUM_OF_PHILO].acquire();
        }
        else {
            forks[(index + 1) % NUM_OF_PHILO].acquire();
            forks[index].acquire();
        }
    }

    /**
     * 放回叉子
     * @param index 第几个哲学家
     * @param leftFirst 是否先放回左边的叉子
     * @throws InterruptedException
     */
    public static void putDownFork(int index, boolean leftFirst) throws InterruptedException {
        if(leftFirst) {
            forks[index].release();
            forks[(index + 1) % NUM_OF_PHILO].release();
        }
        else {
            forks[(index + 1) % NUM_OF_PHILO].release();
            forks[index].release();
        }
    }
}

/**
 * 哲学家
 * @author 骆昊
 *
 */
class Philosopher implements Runnable {
    private int index;      // 编号
    private String name;    // 名字

    public Philosopher(int index, String name) {
        this.index = index;
        this.name = name;
    }

    @Override
    public void run() {
        while(true) {
            try {
                AppContext.counter.acquire();
                boolean leftFirst = index % 2 == 0;
                AppContext.putOnFork(index, leftFirst);
                System.out.println(name + "正在吃意大利面(通心粉)...");   // 取到两个叉子就可以进食
                AppContext.putDownFork(index, leftFirst);
                AppContext.counter.release();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

public class Test04 {

    public static void main(String[] args) {
        String[] names = { "骆昊", "王大锤", "张三丰", "杨过", "李莫愁" };   // 5位哲学家的名字
//      ExecutorService es = Executors.newFixedThreadPool(AppContext.NUM_OF_PHILO); // 创建固定大小的线程池
//      for(int i = 0, len = names.length; i < len; ++i) {
//          es.execute(new Philosopher(i, names[i]));   // 启动线程
//      }
//      es.shutdown();
        for(int i = 0, len = names.length; i < len; ++i) {
            new Thread(new Philosopher(i, names[i])).start();
        }
    }

}

现实中的并发问题基本上都是这三种模型或者是这三种模型的变体。

测试并发代码

对并发代码的测试也是非常棘手的事情,棘手到无需说明大家也很清楚的程度,所以这里我们只是探讨一下如何解决这个棘手的问题。我们建议大家编写一些能够发现问题的测试并经常性的在不同的配置和不同的负载下运行这些测试。不要忽略掉任何一次失败的测试,线程代码中的缺陷可能在上万次测试中仅仅出现一次。具体来说有这么几个注意事项:

  • 不要将系统的失效归结于偶发事件,就像拉不出屎的时候不能怪地球没有引力。
  • 先让非并发代码工作起来,不要试图同时找到并发和非并发代码中的缺陷。
  • 编写可以在不同配置环境下运行的线程代码。
  • 编写容易调整的线程代码,这样可以调整线程使性能达到最优。
  • 让线程的数量多于CPU或CPU核心的数量,这样CPU调度切换过程中潜在的问题才会暴露出来。
  • 让并发代码在不同的平台上运行。
  • 通过自动化或者硬编码的方式向并发代码中加入一些辅助测试的代码。

Java 7的并发编程

Java 7中引入了TransferQueue,它比BlockingQueue多了一个叫transfer的方法,如果接收线程处于等待状态,该操作可以马上将任务交给它,否则就会阻塞直至取走该任务的线程出现。可以用TransferQueue代替BlockingQueue,因为它可以获得更好的性能。
刚才忘记了一件事情,Java 5中还引入了Callable接口、Future接口和FutureTask接口,通过他们也可以构建并发应用程序,代码如下所示。

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Test07 {
    private static final int POOL_SIZE = 10;

    static class CalcThread implements Callable<Double> {
        private List<Double> dataList = new ArrayList<>();

        public CalcThread() {
            for(int i = 0; i < 10000; ++i) {
                dataList.add(Math.random());
            }
        }

        @Override
        public Double call() throws Exception {
            double total = 0;
            for(Double d : dataList) {
                total += d;
            }
            return total / dataList.size();
        }

    }

    public static void main(String[] args) {
        List<Future<Double>> fList = new ArrayList<>();
        ExecutorService es = Executors.newFixedThreadPool(POOL_SIZE);
        for(int i = 0; i < POOL_SIZE; ++i) {
            fList.add(es.submit(new CalcThread()));
        }

        for(Future<Double> f : fList) {
            try {
                System.out.println(f.get());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        es.shutdown();
    }
}

Callable接口也是一个单方法接口,显然这是一个回调方法,类似于函数式编程中的回调函数,在Java 8 以前,Java中还不能使用Lambda表达式来简化这种函数式编程。和Runnable接口不同的是Callable接口的回调方法call方法会返回一个对象,这个对象可以用将来时的方式在线程执行结束的时候获得信息。上面代码中的call方法就是将计算出的10000个0到1之间的随机小数的平均值返回,我们通过一个Future接口的对象得到了这个返回值。目前最新的Java版本中,Callable接口和Runnable接口都被打上了@FunctionalInterface的注解,也就是说它可以用函数式编程的方式(Lambda表达式)创建接口对象。
下面是Future接口的主要方法:

  • get():获取结果。如果结果还没有准备好,get方法会阻塞直到取得结果;当然也可以通过参数设置阻塞超时时间。
  • cancel():在运算结束前取消。
  • isDone():可以用来判断运算是否结束。

Java 7中还提供了分支/合并(fork/join)框架,它可以实现线程池中任务的自动调度,并且这种调度对用户来说是透明的。为了达到这种效果,必须按照用户指定的方式对任务进行分解,然后再将分解出的小型任务的执行结果合并成原来任务的执行结果。这显然是运用了分治法(divide-and-conquer)的思想。下面的代码使用了分支/合并框架来计算1到10000的和,当然对于如此简单的任务根本不需要分支/合并框架,因为分支和合并本身也会带来一定的开销,但是这里我们只是探索一下在代码中如何使用分支/合并框架,让我们的代码能够充分利用现代多核CPU的强大运算能力。

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;

class Calculator extends RecursiveTask<Integer> {
    private static final long serialVersionUID = 7333472779649130114L;

    private static final int THRESHOLD = 10;
    private int start;
    private int end;

    public Calculator(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Integer compute() {
        int sum = 0;
        if ((end - start) < THRESHOLD) {    // 当问题分解到可求解程度时直接计算结果
            for (int i = start; i <= end; i++) {
                sum += i;
            }
        } else {
            int middle = (start + end) >>> 1;
            // 将任务一分为二
            Calculator left = new Calculator(start, middle);
            Calculator right = new Calculator(middle + 1, end);
            left.fork();
            right.fork();
            // 注意:由于此处是递归式的任务分解,也就意味着接下来会二分为四,四分为八...

            sum = left.join() + right.join();   // 合并两个子任务的结果
        }
        return sum;
    }

}

public class Test08 {

    public static void main(String[] args) throws Exception {
        ForkJoinPool forkJoinPool = new ForkJoinPool();
        Future<Integer> result = forkJoinPool.submit(new Calculator(1, 10000));
        System.out.println(result.get());
    }
}

伴随着Java 7的到来,Java中默认的数组排序算法已经不再是经典的快速排序(双枢轴快速排序)了,新的排序算法叫TimSort,它是归并排序和插入排序的混合体,TimSort可以通过分支合并框架充分利用现代处理器的多核特性,从而获得更好的性能(更短的排序时间)。

参考文献

  1. Benjamin J. Evans, etc, The Well-Grounded Java Developer. Jul 21, 2012
  2. Robert Martin, Clean Code. Aug 11, 2008.
  3. Doug Lea, Concurrent Programming in Java: Design Principles and Patterns. 1999

from:http://www.importnew.com/22286.html

Springboot 整合 Dubbo/ZooKeeper 详解 SOA 案例

 

“看看星空,会觉得自己很渺小,可能我们在宇宙中从来就是一个偶然。所以,无论什么事情,仔细想一想,都没有什么大不了的。这能帮助自己在遇到挫折时稳定心态,想得更开。”  – 《腾讯传》
本文提纲
一、为啥整合 Dubbo 实现 SOA
二、运行 springboot-dubbo-server 和 springboot-dubbo-client 工程
三、springboot-dubbo-server 和 springboot-dubbo-client 工程配置详解
 

一、为啥整合 Dubbo 实现 SOA

Dubbo 不单单只是高性能的 RPC 调用框架,更是 SOA 服务治理的一种方案。
核心
1. 远程通信,向本地调用一样调用远程方法。
2. 集群容错
3. 服务自动发现和注册,可平滑添加或者删除服务提供者。
我们常常使用 Springboot 暴露 HTTP 服务,并走 JSON 模式。但慢慢量大了,一种 SOA 的治理方案。这样可以暴露出 Dubbo 服务接口,提供给 Dubbo 消费者进行 RPC 调用。下面我们详解下如何集成 Dubbo。

二、运行 springboot-dubbo-server 和 springboot-dubbo-client 工程

运行环境:JDK 7 或 8,Maven 3.0+
技术栈:SpringBoot 1.5+、Dubbo 2.5+、ZooKeeper 3.3+

 

1.ZooKeeper 服务注册中心
ZooKeeper 是一个分布式的,开放源码的分布式应用程序协调服务。它是一个为分布式应用提供一致性服务的软件,提供的功能包括:配置维护、域名服务、分布式同步、组服务等。
下载 ZooKeeper ,地址 http://www.apache.org/dyn/closer.cgi/zookeeper
解压 ZooKeeper
tar zxvf zookeeper-3.4.8.tar.gz
在 conf 目录新建 zoo.cfg ,照着该目录的 zoo_sample.cfg 配置如下。
cd zookeeper-3.3.6/conf
vim zoo.cfg
zoo.cfg 代码如下(自己指定 log 文件目录):
tickTime=2000
dataDir=/javaee/zookeeper/data 
dataLogDir=/javaee/zookeeper/log
clientPort=2181
在 bin 目录下,启动 ZooKeeper:
cd zookeeper-3.3.6/bin
./zkServer.sh start
2. git clone 下载工程 springboot-learning-example
git clone git@github.com:JeffLi1993/springboot-learning-example.git

然后,Maven 编译安装这个工程:

cd springboot-learning-example
mvn clean install
3.运行 springboot-dubbo-server Dubbo 服务提供者工程
右键运行 springboot-dubbo-server 工程 ServerApplication 应用启动类的 main 函数。Console 中出现如下表示项目启动成功:
这里表示 Dubbo 服务已经启动成功,并注册到 ZK (ZooKeeper)中。
4.运行 springboot-dubbo-client Dubbo 服务消费者工程

右键运行 springboot-dubbo-client 工程 ClientApplication 应用启动类的 main 函数。Console 中出现如下:

...
2017-03-01 16:31:38.473  INFO 9896 --- [           main] o.s.j.e.a.AnnotationMBeanExporter        : Registering beans for JMX exposure on startup
2017-03-01 16:31:38.538  INFO 9896 --- [           main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8081 (http)
2017-03-01 16:31:38.547  INFO 9896 --- [           main] org.spring.springboot.ClientApplication  : Started ClientApplication in 6.055 seconds (JVM running for 7.026)
City{id=1, provinceId=2, cityName='温岭', description='是我的故乡'}
最后打印的城市信息,就是通过 Dubbo 服务接口调用获取的。顺利运行成功,下面详解下各个代码及配置。

三、springboot-dubbo-server 和 springboot-dubbo-client 工程配置详解

1.详解 springboot-dubbo-server Dubbo 服务提供者工程
springboot-dubbo-server 工程目录结构
├── pom.xml
└── src
    └── main
        ├── java
        │   └── org
        │       └── spring
        │           └── springboot
        │               ├── ServerApplication.java
        │               ├── domain
        │               │   └── City.java
        │               └── dubbo
        │                   ├── CityDubboService.java
        │                   └── impl
        │                       └── CityDubboServiceImpl.java
        └── resources
            └── application.properties
a.pom.xml 配置

pom.xml 中依赖了 spring-boot-starter-dubbo 工程,该项目地址是 https://github.com/teaey/spring-boot-starter-dubbo。pom.xml 配置如下

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>springboot</groupId>
<artifactId>springboot-dubbo-server</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>springboot-dubbo 服务端:: 整合 Dubbo/ZooKeeper 详解 SOA 案例</name>

<!-- Spring Boot 启动父依赖 -->
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>1.5.1.RELEASE</version>
</parent>

<properties>
<dubbo-spring-boot>1.0.0</dubbo-spring-boot>
</properties>

<dependencies>

<!-- Spring Boot Dubbo 依赖 -->
<dependency>
<groupId>io.dubbo.springboot</groupId>
<artifactId>spring-boot-starter-dubbo</artifactId>
<version>${dubbo-spring-boot}</version>
</dependency>

<!-- Spring Boot Web 依赖 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>

<!-- Spring Boot Test 依赖 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>

<!-- Junit -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
</dependencies>
</project>

b.application.properties 配置

## Dubbo 服务提供者配置
spring.dubbo.application.name=provider
spring.dubbo.registry.address=zookeeper://127.0.0.1:2181
spring.dubbo.protocol.name=dubbo
spring.dubbo.protocol.port=20880
spring.dubbo.scan=org.spring.springboot.dubbo
这里 ZK 配置的地址和端口,就是上面本机搭建的 ZK 。如果有自己的 ZK 可以修改下面的配置。配置解释如下:
spring.dubbo.application.name 应用名称
spring.dubbo.registry.address 注册中心地址
spring.dubbo.protocol.name 协议名称
spring.dubbo.protocol.port 协议端口
spring.dubbo.scan dubbo 服务类包目录
c.CityDubboServiceImpl.java 城市业务 Dubbo 服务层实现层类
// 注册为 Dubbo 服务
@Service(version = "1.0.0")
public class CityDubboServiceImpl implements CityDubboService {

    public City findCityByName(String cityName) {
        return new City(1L,2L,"温岭","是我的故乡");
    }
}
@Service 注解标识为 Dubbo 服务,并通过 version 指定了版本号。
d.City.java 城市实体类
实体类通过 Dubbo 服务之间 RPC 调用,则需要实现序列化接口。最好指定下 serialVersionUID 值。
2.详解 springboot-dubbo-client Dubbo 服务消费者工程
springboot-dubbo-client 工程目录结构
├── pom.xml
└── src
    └── main
        ├── java
        │   └── org
        │       └── spring
        │           └── springboot
        │               ├── ClientApplication.java
        │               ├── domain
        │               │   └── City.java
        │               └── dubbo
        │                   ├── CityDubboConsumerService.java
        │                   └── CityDubboService.java
        └── resources
            └── application.properties
pom.xml 、 CityDubboService.java、City.java 没有改动。Dubbo 消费者通过引入接口实现 Dubbo 接口的调用。
a.application.properties 配置
## 避免和 server 工程端口冲突
server.port=8081

## Dubbo 服务消费者配置
spring.dubbo.application.name=consumer
spring.dubbo.registry.address=zookeeper://127.0.0.1:2181
spring.dubbo.scan=org.spring.springboot.dubbo
因为 springboot-dubbo-server 工程启动占用了 8080 端口,所以这边设置端口为 8081。
b.CityDubboConsumerService.java 城市 Dubbo 服务消费者
@Component
public class CityDubboConsumerService {

    @Reference(version = "1.0.0")
    CityDubboService cityDubboService;

    public void printCity() {
        String cityName="温岭";
        City city = cityDubboService.findCityByName(cityName);
        System.out.println(city.toString());
    }
}
@Reference(version = “1.0.0”) 通过该注解,订阅该接口版本为 1.0.0 的 Dubbo 服务。
这里将 CityDubboConsumerService 注入 Spring 容器,是为了更方便的获取该 Bean,然后验证这个 Dubbo 调用是否成功。
c.ClientApplication.java 客户端启动类
@SpringBootApplication
public class ClientApplication {

    public static void main(String[] args) {
        // 程序启动入口
        // 启动嵌入式的 Tomcat 并初始化 Spring 环境及其各 Spring 组件
        ConfigurableApplicationContext run = SpringApplication.run(ClientApplication.class, args);
        CityDubboConsumerService cityService = run.getBean(CityDubboConsumerService.class);
        cityService.printCity();
    }
}
解释下这段逻辑,就是启动后从 Bean 容器中获取城市 Dubbo 服务消费者 Bean。然后调用该 Bean 方法去验证 Dubbo 调用是否成功。

四、小结

还有涉及到服务的监控,治理。这本质上和 SpringBoot 无关,所以这边不做一一介绍。感谢阿里 teaey 提供的 starter-dubbo 项目。

垃圾回收原来是这么回事

最近想复习一下JVM的知识。然后发现网上不少文章在写JVM的垃圾回收算法时,都比较偏向于具体实现,而少有站在更高角度来看垃圾回收算法的文章。而我本人想对垃圾回收算法有个全景的认识,所以,就找到了这本《垃圾回收的算法与实现》(以下简称《垃圾回收》)。本篇博客就是尝试对“全景”的总结。

以下为方便讨论,垃圾回收缩写成GC。

  为什么要有GC

我时而听到C++程序员说我们是被GC惯坏了的一代。的确是这样的,我本人在学习GC算法时,大脑里第一问题就是为什么需要GC这样的东西。说明我已经认为GC是理所当然了。

总的一句话:没有GC的世界,我们需要手动进行内存管理,而手动内存管理是纯技术活,又容易出错。

既然我们写的大多程序都是为了解决现实业务问题,那么,我们为什么不把这种纯技术活自动化呢?但是自动化,也是有代价的。 这是我的个人理解,不代表John McCarthy本人的理解

  “垃圾”的定义

首先,我们要给个“垃圾”的定义,才能进行回收吧。书中给出的定义:把分配到堆中那些不能通过程序引用的对象称为非活动对象,也就是死掉的对象,我们称为“垃圾”。

  GC的定义

因为我们期望让内存管理变得自动(只管用内存,不管内存的回收),我们就必须做两件事情: 1. 找到内存空间里的垃圾;2. 回收垃圾,让程序员能再次利用这部分空间 。(《垃圾回收》 P2)只要满足这两项功能的程序,就是GC,不论它是在JVM中,还是在Ruby的VM中。

但这只是两个需求,并没有说明GC应该何时找垃圾,何时回收垃圾等等更具体的问题,各类GC算法就是在这些更具体问题的处理方式上施展手脚。

  GC的历史

John McCarthy身为Lisp之父和人工智能之父,同时,他也是GC之父。1960年,他在其 论文中首次发布了GC算法(其实是委婉的提出)。

《垃圾回收》的作者认为:

从50年前GC算法首次发布以来,众多研究者对其进行了各种各样的研究,因此许多GC算法也得以发布。但事实上,这些算法只不过是把前文中提到的三种算法进行组合或应用。也可以这么说,1963年GC复制算法诞生时,GC的根本性内容就已经完成了。(《垃圾回收》 P4)

那我们常常听说的分代垃圾回收又是怎么回事?作者是这样说的:人们从众多程序案例中总结出了一个经验:“大部分的对象在生成后马上就变成了垃圾,很少有对象能活得很久”。分代垃圾回收利用该经验,在对象中导入了“年龄”的概念,经历过一次GC后活下来的对象年龄为1岁。(垃圾回收》 P141)

分代垃圾回收中把对象分类成几代,针对不同的代使用不同的GC算法,我们把刚生成的对象称为新生代对象,到达一定年龄的对象则称为老年代对象。(《垃圾回收》 P142)

好了,这下我总算知道为什么要分代了,我的总结是: 将对象根据存活概率进行分类,对存活时间长一些的对象,可以减少扫描“垃圾”的时间,以减少GC频率和时长。 根本思路就是对对象进行分类,才能针对各个分类采用不同的垃圾回收算法,以对各算法进行扬长避短。

留一个问题给读者:我们知道分代垃圾回收所采用的堆结构是:

为什么新生代空间要分成“生成空间”和“幸存空间”,而幸存空间又分成两块大小相等的幸存空间1,幸存空间2?

  这些GC算法共同解决的问题

上面我们说了,GC的定义只给出了需求,三种算法都为实现这个需求,那么它们总会遇到共同要解决的问题吧? 我尝试总结出:

  • 如何分辨出什么是垃圾?
  • 如何、何时搜索垃圾?
  • 如何、何时清除垃圾?

这样,只要涉及到垃圾回收,我就可以从这2点需求,3个共同问题(两点三共)出发来讨论、学习。

  如何评价GC算法?

如果没有评价标准,我们当然无法评估这些GC算法的性能。作者给出了4个标准:

  • 吞吐量: 单位时间内的处理能力
  • 最大暂停时间:GC执行过程中,应用暂停的时长。较大的吞吐量和较短的最大暂停时间不可兼得
  • 堆的使用效率:就是堆空间的利用率。可用的堆越大,GC运行越快;相反,越想有效地利用有限的堆,GC花费的时间就越长。
  • 访问的局部性:把具有引用关系的对象安排在堆中较近的位置,就能提高在缓存中读取到想利用的数据的概率。

好吧。两点三共,四标~

  小结

  搞清楚为什么要GC,要实现GC都要解决什么问题,而各类算法又是怎么解决的,最后怎么评价这些算法。GC原来是这么回事。

但是这不是GC的全部。但是提供我一个思考GC的思考框架。

以上就是《垃圾回收的算法与实现》的读书笔记。如果想更深入,可以阅读《垃圾回收算法手册:自动内存管理的艺术》。

from:http://kb.cnblogs.com/page/562433/