Quantcast
Channel: Blog dbi services
Viewing all articles
Browse latest Browse all 1431

WebLogic Server process takes 100% CPU

$
0
0

During some monitoring work I niticed taht top shows the process 12013 using 800% CPU (We have a 12 CPU machine) and had to find out what’s happening !!

I used top to check the process ID and then the threads taking the CPU

ps -ef |grep 12013

weblogic 12013 11836 23 Apr03 ?        03:38:31 /app/weblogic/Java/jdk/bin/java -server -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dweblogic.Name=msD2-02 -Djava.security.policy=/app/weblogic/Middleware/wlserver/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true -Dweblogic.system.BootIdentityFile=/app/weblogic/domains/myDomain/servers/msD2-02/data/nodemanager/boot.properties -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.nmservice.RotationEnabled=true -Dweblogic.security.SSL.ignoreHostnameVerification=false -Dweblogic.ReverseDNSAllowed=false -Xms2048m -Xmx2048m -XX:MaxMetaspaceSize=512m -Dcom.sun.xml.ws.api.streaming.XMLStreamReaderFactory.woodstox=true -Dcom.sun.xml.ws.api.streaming.XMLStreamWriterFactory.woodstox=true -Djava.io.tmpdir=/app/weblogic/tmp/myDomain/msD2-02 -Ddomain.home=/app/weblogic/domains/myDomain -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.security.SSL.protocolVersion=TLS1 -Dweblogic.security.disableNullCipher=true -Djava.security.egd=file:///dev/./urandom -Dweblogic.security.allowCryptoJDefaultJCEVerification=true -Dweblogic.nodemanager.ServiceEnabled=true -Djava.endorsed.dirs=/app/weblogic/Java/jdk1.8.0_45/jre/lib/endorsed:/app/weblogic/Middleware/wlserver/../oracle_common/modules/endorsed -da -Dwls.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.home=/app/weblogic/Middleware/wlserver/server -Dweblogic.management.server=https://10.10.1.1:7001 -Dweblogic.utils.cmm.lowertier.ServiceDisabled=true weblogic.Server

top -H -b -p 12013

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12624 weblogic  20   0 10.5g 3.6g  30m S 83.1 11.5  18053:38 java
 2540 weblogic  20   0 10.5g 3.6g  30m S 79.4 11.5   3:11.62 java
 2591 weblogic  20   0 10.5g 3.6g  30m S 79.4 11.5   3:08.02 java
12436 weblogic  20   0 10.5g 3.6g  30m S  9.2 11.5   3:14.59 java

Now that I have the Unix threads PID, I will get the thread dump of the WebLogic Server process and try to figure out the culprits. To generate the thread dump I used the jstack java tool.

jstack -l 12013 >  $HOME/jstack_100_CPU.txt

The Thread dump provides the threads with the ID in Hexadecimal, thus those provided from the top needs to be converted:

[myDomain]$ printf '%x\n' 12624
3150
[myDomain]$  printf '%x\n' 2540
9ec
[myDomain]$  printf '%x\n' 2591
a1f
[myDomain]$
"LDAPConnThread-1172 ldaps://ldapserver.dbi-lab.com:3636" #167119 daemon prio=5 os_prio=0 tid=0x00007f5444087800 nid=0xa1f runnable [0x00007f5409e88000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked  (a sun.nio.ch.Util$2)
        - locked  (a java.util.Collections$UnmodifiableSet)
        - locked  (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.readInternal(NIOSocketMuxer.java:802)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:746)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:729)
        at weblogic.socket.JSSEFilterImpl.readFromNetwork(JSSEFilterImpl.java:462)
        at weblogic.socket.JSSEFilterImpl.read(JSSEFilterImpl.java:424)
        at weblogic.socket.JSSESocket$JSSEInputStream.read(JSSESocket.java:98)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
        - locked  (a java.io.BufferedInputStream)
        at netscape.ldap.ber.stream.BERElement.getElement(Unknown Source)
        at netscape.ldap.LDAPConnThread.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)

 
"[STUCK] ExecuteThread: '302' for queue: 'weblogic.kernel.Default (self-tuning)'" #326 daemon prio=1 os_prio=0 tid=0x00007f5560376000 nid=0x3150 runnable [0x00007f543636e000]
   java.lang.Thread.State: RUNNABLE
        at java.io.InputStream.skip(InputStream.java:224)
        at weblogic.utils.http.HttpChunkInputStream.skip(HttpChunkInputStream.java:287)
        at weblogic.utils.http.HttpChunkInputStream.skipAllChunk(HttpChunkInputStream.java:497)
        at weblogic.servlet.internal.ServletInputStreamImpl.ensureChunkedConsumed(ServletInputStreamImpl.java:51)
        at weblogic.servlet.internal.ServletRequestImpl.skipUnreadBody(ServletRequestImpl.java:217)
        at weblogic.servlet.internal.ServletRequestImpl.reset(ServletRequestImpl.java:169)
        at weblogic.servlet.internal.HttpConnectionHandler.prepareRequestForReuse(HttpConnectionHandler.java:258)
        at weblogic.servlet.internal.HttpConnectionHandler.requeue(HttpConnectionHandler.java:665)
        at weblogic.servlet.internal.VirtualConnection.requeue(VirtualConnection.java:332)
        at weblogic.servlet.internal.ServletResponseImpl.send(ServletResponseImpl.java:1657)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1582)
        at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:255)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:311)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:263)

"LDAPConnThread-1171 ldaps://ldapserver.dbi-lab.com:3636"" #167118 daemon prio=5 os_prio=0 tid=0x00007f5544048000 nid=0x9ec runnable [0x00007f540a28c000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
        at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:317)
        at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:193)
        - locked  (a java.lang.Object)
        at java.nio.channels.spi.AbstractSelector$1.interrupt(AbstractSelector.java:213)
        at java.nio.channels.spi.AbstractSelector.begin(AbstractSelector.java:219)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:78)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked  (a sun.nio.ch.Util$2)
        - locked  (a java.util.Collections$UnmodifiableSet)
        - locked  (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.readInternal(NIOSocketMuxer.java:802)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:746)
        at weblogic.socket.NIOSocketMuxer$NIOInputStream.read(NIOSocketMuxer.java:729)
        at weblogic.socket.JSSEFilterImpl.readFromNetwork(JSSEFilterImpl.java:462)
        at weblogic.socket.JSSEFilterImpl.read(JSSEFilterImpl.java:424)
        at weblogic.socket.JSSESocket$JSSEInputStream.read(JSSESocket.java:98)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
        - locked  (a java.io.BufferedInputStream)
        at netscape.ldap.ber.stream.BERElement.getElement(Unknown Source)
        at netscape.ldap.LDAPConnThread.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)

This looks to be an issue with the LDAP server or the network.

The WebLogic java threads loops on the LDAP connection to get data. This loop should fail in a timeout depending on the Authentication provider LDAP connection timeout settings. But it looks there is an issue with the JDK that reinitiates the connection in the sun.nio.ch.EPollArrayWrapper class.

Once the LDAP connection was back the WebLogic Domain could be restarted properly and the service was back.

A possible workaround could be to not use the weblogic.socket.NIOSocketMuxer but the weblogic.socket.PosixSocketMuxer. This workaround is described in Oracle Suport note:2128032.1
Or a better solution is to upgrade to the latest WebLogic Software version and apply the latest PSU where this issue is fixed.

Cet article WebLogic Server process takes 100% CPU est apparu en premier sur Blog dbi services.


Viewing all articles
Browse latest Browse all 1431

Trending Articles