Tuesday, August 24, 2010

Sizing Oracle Coherence Applications

The following is the last of the five posts which will list the best practices and performance suggestions for tuning one’s Oracle Coherence environment.  This is a general guide that will evolve as new product features are added to Oracle Coherence.

Sizing Oracle Coherence

Remember that there are four types of processes in a Coherence deployment:
  1. JVMs that are used for data storage (“storage JVMs”)
  2. JVMs that run client applications and do not store data (“client JVMs”)
  3. JVMs that run client applications and connect to the cluster via Coherence*Extend (TCP/IP)
  4. .NET applications that run client applications and connect to the cluster via Coherence*Extend (TCP/IP)
Storage and client JVMs (types 1 and 2 above) communicate using TCMP (“Tangosol Cluster Management Protocol”, based on UDP unicast).

 

Tips for Sizing Oracle Coherence

  • Allow extra space for overhead
  • Every object has one full backup on another JVM on another machine
  • If a JVM fails, Oracle Coherence automatically fails over AND creates new backups on other JVMs
  • This means that if a JVM fails, other JVMs will need to accommodate the backups of the objects
  • Rule of thumb: each 1 GB JVM can store 350 MB of actual object data
  • That means a 16 GB machine running 13 such JVMs will support about 4.5 GB of raw object data (13 JVMs * 350 MB = 4.55 GB)
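
The failover tips above can be made concrete with a little arithmetic. The following is only a sketch and is not from the original guidance: it assumes data is spread evenly across JVMs, that each object is stored exactly twice (primary plus one backup), and it uses a hypothetical 20,000 MB data set on machines running 13 JVMs each. The function name is illustrative.

```python
def stored_per_jvm_mb(primary_data_mb, total_jvms, failed_jvms=0):
    """Average MB held per surviving JVM, counting primary and backup copies.

    Assumes an even spread and exactly one backup per object (2 copies total).
    """
    surviving = total_jvms - failed_jvms
    if surviving <= 0:
        raise ValueError("no surviving JVMs")
    return 2 * primary_data_mb / surviving


# Hypothetical scenario: 20,000 MB of data, 13 JVMs per machine,
# and one whole machine (13 JVMs) lost.
for machines in (5, 6):
    jvms = machines * 13
    before = stored_per_jvm_mb(20_000, jvms)
    after = stored_per_jvm_mb(20_000, jvms, failed_jvms=13)
    print(f"{machines} machines: {before:.0f} MB/JVM normally, "
          f"{after:.0f} MB/JVM after losing one machine")
```

Under these assumptions, the per-JVM load rises noticeably once a machine is lost, which is why spare capacity should be part of the sizing.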


    Question #1:    How many 1 GB JVMs can you run on a box with 16 GB of RAM?

    Answer: At most 13
    Start with 16 GB of RAM
    Subtract the RAM required for the OS and other apps (400 MB in this example)
    Divide by 1.2 GB (remember, 1 GB of heap uses about 1.2 GB of RAM)
    (16 GB - 400 MB) / 1.2 GB = 13
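
The calculation above can be sketched as a small helper. This is an illustration only: the function name is made up, and it works in whole megabytes with 1 GB treated as 1000 MB, matching the rounding used in the example.

```python
def max_jvms(total_ram_mb, reserved_mb, ram_per_jvm_mb):
    """How many JVMs fit once RAM for the OS and other apps is set aside.

    ram_per_jvm_mb is the real footprint of one JVM, not just its heap
    (the example uses 1200 MB of RAM for a 1000 MB heap).
    """
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb < 0:
        raise ValueError("reserved RAM exceeds total RAM")
    return usable_mb // ram_per_jvm_mb  # whole JVMs only, so round down


# The example from Question #1: 16 GB box, 400 MB reserved, 1.2 GB per JVM.
print(max_jvms(16_000, 400, 1_200))
```

Rounding down is deliberate: a fractional JVM cannot be started, and over-committing RAM pushes the machine into swapping.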


    Question #2:    How many 16 GB machines will be required to support 20 GB of data in the grid?

    Answer: At least five (six for HA)
    * Each JVM handles 350 MB
    * You have 13 JVMs per machine
    * You have 4.5 GB per machine (13 * 350 MB)
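    * 20 GB / 4.5 GB per box = 4.44
    * Round up to 5
    * For HA, add one more machine (6) so that the remaining machines can absorb the data of a failed machine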
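
The same steps can be written as a short calculation. This is a sketch under the assumptions of the example (350 MB of data per JVM, 13 JVMs per machine, 1 GB treated as 1000 MB); the function and the `spare_machines` parameter are illustrative names, not part of Coherence.

```python
import math


def machines_needed(data_mb, jvms_per_machine, data_per_jvm_mb, spare_machines=0):
    """Machines required to hold data_mb, plus optional spares for HA."""
    capacity_per_machine_mb = jvms_per_machine * data_per_jvm_mb
    base = math.ceil(data_mb / capacity_per_machine_mb)  # partial machines round up
    return base + spare_machines


# The example from Question #2: 20 GB of data, 13 JVMs per box, 350 MB per JVM.
print(machines_needed(20_000, 13, 350))                    # minimum
print(machines_needed(20_000, 13, 350, spare_machines=1))  # with one spare for HA
```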


    I hope this series of five posts on Oracle Coherence has been helpful to you!

    Tuesday, August 17, 2010

    Tuning Oracle Coherence*Web Applications

    The following is the fourth of five posts which will list the best practices and performance suggestions for tuning one’s Oracle Coherence environment.  This is a general guide that will evolve as new product features are added to Oracle Coherence.

    Oracle Coherence*Web

    Oracle Coherence*Web is an HTTP session management module and a drop-in replacement for application server container session management. It “wraps” existing web applications: no runtime byte code manipulation is done, and any requests to use sessions (from servlets, JSPs, filters, etc.) are intercepted by Oracle Coherence*Web wrappers.  For more details you can look at:

    Coherence*Web Session Management Module
    Coherence*Web and WebLogic Server
    Coherence*Web and Other Application Server Containers

     

    High-Level Steps to enable Oracle Coherence*Web

    • Run the inspector on the existing WAR/EAR file. This generates a coherence-web.xml configuration file, which wraps all servlets, filters, etc. with Coherence implementations and also contains the configuration settings for Coherence*Web.
    • Inspect the coherence-web.xml file and modify it if any changes are required.
    • Run the installer process on the existing WAR/EAR. This generates a new WAR/EAR and backs up the original WAR/EAR.
    • Deploy the new WAR/EAR to the application server container.
    • The complete steps for the Oracle WebLogic Server are listed here.

     

    Troubleshooting Oracle Coherence*Web

    • Obtain a baseline for the application without Coherence to determine how sessions are being used and replicated. This makes it easier to compare against the Coherence deployment and to troubleshoot further.
    • Network throughput can be an issue as well. Run a datagram test to determine how much data can be pushed between machines. This will help tune the network between the web application tier and the data grid. Review the networking section of the first post in this series.
    • The session model is a factor in performance. The split session model is the default; it keeps small session attributes in the near cache, while large ones are accessed from the grid. If the application regularly uses many large attributes, another model may be more appropriate. Review the Session Models at this link.
    • Review the cache configuration file. From a web application caching perspective, Coherence*Web benefits greatly from a near caching scheme, where objects smaller than 1 KB are kept in the local JVM, avoiding the network hop and serialization/deserialization.
    • If one is deploying multiple web applications, it is sometimes desirable to share session attributes and sometimes not. The scoping of session attributes is configurable; see the scoping link on this page.
    • The proper size of the grid depends on the web application and how it uses the session. Test with some average test cases to arrive at a metric such as “1 user takes X MB in the grid”.
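
Once a per-user metric has been measured, it can be turned into a rough grid size. The sketch below is illustrative only: the 50 KB per session and 50,000 concurrent sessions are made-up inputs, the 350 MB per 1 GB JVM figure is the rule of thumb from the sizing post in this series, and 1 MB is treated as 1000 KB.

```python
import math


def session_data_kb(concurrent_sessions, kb_per_session):
    """Total session data held in the grid, in KB."""
    return concurrent_sessions * kb_per_session


def storage_jvms_needed(total_kb, data_per_jvm_mb=350):
    """Storage JVMs required, using a per-JVM data capacity in MB."""
    return math.ceil(total_kb / (data_per_jvm_mb * 1000))


# Hypothetical measurement: each session takes 50 KB in the grid.
total_kb = session_data_kb(concurrent_sessions=50_000, kb_per_session=50)
print(total_kb // 1000, "MB of session data")
print(storage_jvms_needed(total_kb), "storage JVMs")
```

The per-session figure should come from testing the actual application; the numbers here only show how the metric feeds into the sizing.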

    Tuesday, August 10, 2010

    Tuning/Troubleshooting Oracle Coherence Applications

    The following is the third of five posts which will list the best practices and performance suggestions for tuning one’s Oracle Coherence environment.  This is a general guide that will evolve as new product features are added to Oracle Coherence.


    Troubleshooting Multicast Issues  

     

    1. If you have Oracle Coherence installed on the hosts between which you're testing multicasting, you can use its multicast connectivity test.
    2. In addition, you can use its datagram test to measure network throughput. The practical maximum on a well-tuned gigabit Ethernet link is ~115 MB/sec.
    3. Make sure to use: -Djava.net.preferIPv4Stack=true
    4. Optionally, one can use Well-Known-Addresses (WKA or Unicast) to eliminate any multicast issue.
    5. If one is on Windows 2003, 2008, Vista, or Windows 7 and is experiencing problems with sharing ports for multicast, check the registry for HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Afd\Parameters\DisableAddressSharing and see if it is set to “1”. If so, change this to “0”, reboot the machine, and retest. From Microsoft: "Enhanced socket security was added with the release of Windows Server 2003. In previous Microsoft server operating system releases, the default socket security easily allowed processes to hijack ports from unsuspecting applications. In Windows Server 2003, sockets are not in a sharable state by default. Therefore, if an application wants to allow other processes to reuse a port on which a socket is already bound, it must specifically enable it."
    6. If one is on Windows, run the following command to generate further information on the machine’s networking:
    netsh firewall show state verbose=enable
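
To put a datagram test result in context, it can be compared with the raw capacity of the link. The sketch below is illustrative and not part of the Coherence tooling: a gigabit link carries 1000 Mbit/sec, which is 125 MB/sec before any protocol overhead, so the ~115 MB/sec figure above corresponds to roughly 92% of the raw rate. The 60 MB/sec reading is a made-up example.

```python
def link_utilization(measured_mb_per_sec, link_mbit_per_sec=1000):
    """Fraction of the raw link rate achieved by a measured throughput.

    Uses decimal units: 8 bits per byte, so 1000 Mbit/sec = 125 MB/sec.
    """
    raw_mb_per_sec = link_mbit_per_sec / 8
    return measured_mb_per_sec / raw_mb_per_sec


# The practical maximum quoted above, and a hypothetical poor result.
for measured in (115, 60):
    print(f"{measured} MB/sec = {link_utilization(measured):.0%} of a gigabit link")
```

A result far below the practical maximum suggests the network, rather than Coherence itself, deserves attention first.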

     

    Log Messages Explanation


    Review the following link for the causes of, and recommended actions for, common TCMP (Tangosol Cluster Management Protocol, based on UDP unicast) log messages from Coherence.

    Tuesday, August 3, 2010

    Troubleshooting Checklist for Oracle Coherence Applications

    The following is the second of five posts which will list the best practices and performance suggestions for tuning one’s Oracle Coherence environment.  This is a general guide that will evolve as new product features are added to Oracle Coherence.

    General Oracle Coherence Performance Questions

    The following is a general list of questions to review when troubleshooting performance issues with Coherence.
    1. What application server is being used in conjunction with Oracle Coherence?
    2. Is Oracle Coherence being run within the same JVM as the app-server container, or is the data grid set up outside of the app-server container? (i.e. is storage disabled here with -Dtangosol.coherence.distributed.localstorage=false )
    3. How many storage nodes are being used for Oracle Coherence? (Is there adequate storage for all the data?)
    4. What size are the Java heaps for these storage nodes?
    5. Are the out-of-the-box Oracle Coherence configuration files being used from within coherence.jar? (i.e. Coherence itself has not been tuned to the environment/application?) See Sample Cache Configuration Files for details.
    6. Are configuration files specified via a -D flag to the Oracle Coherence Cache Servers or within a jar file? i.e. are -Dtangosol.coherence.override=<file> and -Dtangosol.coherence.cacheconfig=<file> being used?
    7. What is the Thread-count set for Oracle Coherence?
    8. What type of cache is being used: partitioned (distributed), near, or replicated? “Partitioned/Distributed cache gives a real linear scalability and should be used in pretty much all scenarios. With Replicated cache the same data are copied over to all the nodes and is very performance taxing if data are changed.” See the information on the near cache, partitioned cache, and replicated cache.
    9. Multicast or Unicast? (Review the Troubleshooting Multicast Issues section in the third post of this series.)
    10. Is this a 32-bit JVM or a 64-bit JVM? JRockit or Sun JVM?
    11. What garbage collection algorithm is being used?
    12. Review the Platform-Specific Deployment Considerations section of the documentation.
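
Several of the questions above (storage enabled or not, heap size, which configuration files) are easier to answer when the cache server launch command is assembled in one place. The following is a minimal sketch, not a recommended configuration: the system properties are the ones named in this series, `com.tangosol.net.DefaultCacheServer` is the standard cache server main class, and the file names, classpath, and heap size are hypothetical placeholders.

```python
def cache_server_command(override_file, cache_config_file,
                         heap="1g", classpath="coherence.jar",
                         local_storage=True):
    """Build the argument list for launching a Coherence cache server JVM."""
    storage = "true" if local_storage else "false"
    return [
        "java",
        f"-Xms{heap}", f"-Xmx{heap}",            # fixed heap size (question 4)
        "-Djava.net.preferIPv4Stack=true",        # from the multicast tips
        f"-Dtangosol.coherence.distributed.localstorage={storage}",  # question 2
        f"-Dtangosol.coherence.override={override_file}",            # question 6
        f"-Dtangosol.coherence.cacheconfig={cache_config_file}",     # question 6
        "-cp", classpath,
        "com.tangosol.net.DefaultCacheServer",
    ]


# Hypothetical file names; substitute the ones used in your environment.
cmd = cache_server_command("my-override.xml", "my-cache-config.xml")
print(" ".join(cmd))
```

Printing the assembled command for each node type makes it straightforward to confirm which nodes store data and which configuration files they actually load.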