6/22/2007 10:46 AM | |
Posts: 8 Rating: (0) |
We have implemented a new control system to track vehicle locations and monitor alarms from vehicle stops. The system consists of 2 redundant servers connected to 69 S7 (S7-300) RTUs and 13 client workstations, running 24 hours a day, 7 days a week. We have found that, after running for some time, the workstations show "No connection to data server" in the alarm banner. We checked the database on the WinCC server that the affected clients were connected to and found that the collection of alarm data had stopped. The alarm banner on all clients connected to this server had stopped as well. The other WinCC server, and the clients connected to it, appear to keep running normally.
We have taken many corrective actions, including:
1. Upgrading the server hardware from PIII 800 MHz to Dual Core2Duo
2. Upgrading the Windows OS from WINNT 4.0 to WIN2K (both servers and all workstations)
3. Enabling dual LAN ports and replacing the LAN cables between the network switch and the servers
This system has been under development since 2004. The first case of "No connection to data server" was reported in Dec 2005, and since then we have had no idea how to solve it. The corrective actions so far (e.g. exchanging hardware and software, networking changes) have been in vain. Since this seriously affects our operation, could any expert help us solve this problem? I have captured the diagnostic logs for the case that occurred on 5 Jun 07 from the 13 workstations and 2 WinCC servers. I hope they provide more information for analysis.
WinCC Server Configuration:
Hardware: Primergy TX200 S3 (Quad-Core Xeon 5000, 2.33 GHz)
Operating System: Microsoft Windows 2000 Server Edition
Operating System Service Pack: Microsoft Windows 2000 Service Pack 4
SIMATIC Software Packages:
SIMATIC WinCC V5.1 SP2
SIMATIC NET PC Software V6.0 SP5 HF2
SIMATIC NET Industrial Ethernet-S7 V5.2
SIMATIC WinCC Redundancy V5.1 SP2
SIMATIC WinCC Server V5.1 SP2
SIMATIC WinCC Storage V5.1 SP2
SIMATIC WinCC User Archives V5.1 SP2

WinCC Workstation Configuration:
Hardware: Fujitsu Siemens CELSIUS 670 Workstation (Dual-Core Xeon, 1.5 GHz)
Operating System: Microsoft Windows 2000 Edition
Operating System Service Pack: Microsoft Windows 2000 Service Pack 4
SIMATIC Software Packages:
SIMATIC WinCC V5.1 SP2

Here are the occasions on which "No connection to data server" required a WinCC server reboot:
25/3/07 15:03 - Reboot Server 1
29/3/07 15:00 - Reboot Server 2
16/5/07 03:00 - Reboot Server 1
20/5/07 19:30 - Reboot Server 1 & Server 2
23/5/07 09:00 - Reboot Server 2
24/5/07 09:44 - Reboot Server 2
25/5/07 13:32 - Reboot Server 1
26/5/07 11:42 - Reboot Server 1
27/5/07 09:00 - Reboot Server 2
28/5/07 10:10 - Reboot Server 2
31/5/07 15:25 - Reboot Server 1 & Server 2
02/6/07 20:00 - Reboot Server 1
03/6/07 21:30 - Reboot Server 1
04/6/07 16:10 - Reboot Server 1
05/6/07 10:00 - Reboot Server 1 (NEW)

I would be very grateful if anyone could help us solve this problem!! Thank you again!! Best Regards, Kant
Attachment: Diag.zip (525 Downloads) |
6/22/2007 10:50 AM | |
Posts: 8 Rating: (0) |
The data log for the case that occurred on 5 Jun 07 includes the workstation screen dump and the alarm logs for both servers, for your analysis. Furthermore, I have attached the WinCC application system configuration, the export lists for the text library and alarm logging, and the diagnostics from the 13 workstations and 2 WinCC servers. I hope it provides more information for analysis. Here is the log data for WinCC Server 1.
Attachment: Srv1Data.zip (510 Downloads) |
6/22/2007 10:53 AM | |
Posts: 8 Rating: (0) |
Here is the log data for WinCC Server 2.
Attachment: Srv2Data.zip (432 Downloads) |
6/22/2007 10:56 AM | |
Posts: 8 Rating: (0) |
Here is our system configuration.
Attachment: System_Config.zip (535 Downloads) |
6/22/2007 1:24 PM | |
Posts: 3149 Rating: (171)
|
Hi,
I guess that this is causing a lot of trouble:
255,05.06.2007,10:40:16:513,1007000,4,,WINCC_SRV1_KCR,SCRIPT,ActionOverflow:more than 5000 Actions to work
255,05.06.2007,10:45:27:330,1007008,4,,WINCC_SRV1_KCR,SCRIPT,EndAct Timeout
255,05.06.2007,10:45:27:393,1007000,4,,WINCC_SRV1_KCR,SCRIPT,ActionOverflow:more than 5000 Actions to work
=> Please check this FAQ: http://support.automation.siemens.com/WW/view/en/2357302
nemo |
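For anyone who wants to check their own diagnostic logs for the same entries, here is a minimal Python sketch that counts SCRIPT messages in lines of the comma-separated shape quoted above. The field meanings are an assumption read off those three sample lines (field 4 looks like a message number, field 7 the computer name, field 8 the source, the rest the message text); they are not taken from Siemens documentation, and the function names are made up for illustration.

```python
from collections import Counter

# Sample lines copied from the post above.
SAMPLE = """\
255,05.06.2007,10:40:16:513,1007000,4,,WINCC_SRV1_KCR,SCRIPT,ActionOverflow:more than 5000 Actions to work
255,05.06.2007,10:45:27:330,1007008,4,,WINCC_SRV1_KCR,SCRIPT,EndAct Timeout
255,05.06.2007,10:45:27:393,1007000,4,,WINCC_SRV1_KCR,SCRIPT,ActionOverflow:more than 5000 Actions to work
"""


def parse_line(line):
    """Split one log line into named fields (field meanings are assumed)."""
    # Split into at most 9 fields so commas inside the message text survive.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) != 9:
        return None  # not a line of the expected shape
    return {
        "date": parts[1],
        "time": parts[2],
        "number": parts[3],    # assumed: message number
        "computer": parts[6],  # assumed: computer name
        "source": parts[7],    # assumed: reporting component
        "text": parts[8],
    }


def count_script_messages(lines):
    """Count occurrences of each (number, text) pair reported by SCRIPT."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and rec["source"] == "SCRIPT":
            counts[(rec["number"], rec["text"])] += 1
    return counts


if __name__ == "__main__":
    result = count_script_messages(SAMPLE.splitlines())
    for (number, text), n in result.most_common():
        print(n, number, text)
```

To scan a real log, pass an open file object instead of `SAMPLE.splitlines()`; the file name and encoding depend on the installation and are not known from this thread.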
6/22/2007 4:58 PM | |
Posts: 1275 Rating: (123)
|
Hello Kant, I see you've posted the diagnose folders, that's good. In the e-mail, I also asked you this, did you look at it? I notice you're using V5.x. V5.0 limits the number of messages displayed in an alarm window to 10000 and the database size to 2 GB. This has been improved in V6, where there is no limit on database size (except, naturally, the hard disk size). You should check the database size... WinCC V5 has two archives, a short-term and a long-term archive. In the Alarm Logging Editor you create these archives AND specify how much data to store in each one. Thread 1, Thread 2, Thread 3, Thread 4. Best regards, Danielle |
7/3/2007 10:24 AM | |
Posts: 8 Rating: (0) |
Dear Nemo, I tried to capture the "ActionOverflow:more than 5000 Actions to work" problem after rebooting the WinCC server, but I cannot find the same message in the log again. However, the WinCC "No connection to data server" problem still exists and still requires a server reboot. The most recent case occurred on 1 Jul 07 at 1:48, and I have captured the diagnostic log for your reference. Is there any method or script that could capture the element (e.g. in pictures, global scripts, or functions) that causes the problem, so that we could investigate further?
Thanks a lot!! Best Regards, Kant
Attachment: Diagnose.zip (474 Downloads) |
7/3/2007 11:47 AM | |
Posts: 8 Rating: (0) |
Dear Danielle, I spent some time checking the possibilities raised in the threads you provided.

For Thread 1:
--------------------
I have compared the Text Library content with the Alarm Logging; the same alarm messages appear in both places. Since we have defined English as the only language, all other languages (e.g. German) are blank. The contents of the Text Library and the Alarm Logging appear to be consistent. To simulate the fault, I deliberately changed one alarm message in the Text Library (deleted the message content) so that it no longer matched the Alarm Logging. I only got a blank message in the alarm banner; the "No connection to data server" message did not appear. I had read this thread before but could not find what is missing in the Text Library, so I uploaded a dump of the Text Library and Alarm Logging content in a previous attachment. It may give you more information for solving this problem.

For Thread 2:
--------------------
The Alarm Control object seems to work well when the problem is not occurring; it displays all the alarms obtained from the S7 RTUs. We run two redundant servers. When the problem occurs, all clients connected to the affected server stop collecting alarms and display "No connection to data server" in the alarm banner, while the other server keeps running well. All clients connected to that other server also run normally. I therefore believe the Alarm Control object is OK; otherwise all clients would display "No connection to data server" immediately when the project starts up.

For Thread 3:
--------------------
We searched for errors in the RT database using the SCVIEW command from Sybase. No obvious error could be observed, and the data logged in the database seems normal. The database size is around 1.1 GB, still below the 2 GB limit (see the attachment for the screen capture).

For Thread 4:
--------------------
We previously tried emptying the runtime and archive databases (short-term and long-term archives) by replacing them with clean, empty databases in the project, but this did not help. The problem occurred again a few days after this action.

We have also considered upgrading WinCC from V5 to V6, but many DLLs have already been developed that call Sybase, and reprogramming all of them may not be possible. All available patches for WinCC and Windows have been applied. Does anyone have a suggestion for how we could trace the root cause of this problem? Since we cannot locate the root cause, we cannot solve it.

Thanks for the kind support!! Best Regards, Kant
Attachment: DB_Size.zip (417 Downloads) |
Last edited by: Kant at: 03.07.2007 11:49 |
|
7/26/2007 7:45 AM | |
Posts: 3149 Rating: (171)
|
Hi, please attach your latest WinCC logfiles... nemo |
7/26/2007 8:42 AM | |
Posts: 8 Rating: (0) |
Dear Nemo, here are our latest diag logs captured from the two WinCC servers. Each time after rebooting a server, we clear the diag folder so that the data do not get mixed up. WinCC Server 1 was rebooted on 22 July 07 and 25 July 07; the diag log covers the period between these two reboots. Thanks for your help!! Best Regards, Kant
Attachment: SRV_Diag.zip (378 Downloads) |
7/26/2007 2:22 PM | |
Posts: 1275 Rating: (123)
|
Hi Kant,
There are some errors in your logs, especially for Server 1, such as:

Server 1:
2007-07-23 10:23:46,968 ERROR - ChannelUnit::SysMessage("[OPC Groups (OPCHN Unit #1)]![PartnerServer]: CoCreateInstance for server "OPCServer.WinCC" on machine WINCC_SRV2_KCR failed, Error=80040154 (HRESULT = 80040154 - REGDB_E_CLASSNOTREG (Class not registered))")
Check this thread.
2007-07-23 02:00:34,906 ERROR Connectionerror 1 "RTU_S425": Errorcode 0xFFDF 410E!
For this one, check these FAQ's out: 1 and 2.
255,25.07.2007,04:10:55:734,1007008,4,,WINCC_SRV1_KCR,SCRIPT,EndAct Timeout
255,24.07.2007,04:59:43:531,1003015,4,Administrator,WINCC_SRV1_KCR,Alarm Logging,AlarmLogging is being overloaded with 2446 messages / 10 min.

Server 2:
2007-07-25 04:10:54,453 ERROR - ChannelUnit::SysMessage("[OPC Groups (OPCHN Unit #1)]![PartnerServer]: CoCreateInstance for server "OPCServer.WinCC" on machine WINCC_SRV1_KCR failed, Error=80040154 (HRESULT = 80040154 - REGDB_E_CLASSNOTREG (Class not registered))")
2007-07-25 04:11:04,453 ERROR ..FOPCData::InitOPC CoCreateInstanceEx- ERROR 80040154
Also.. check this out.
Best regards, Danielle |
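As a side note on the Error=80040154 value in the OPC entries above: an HRESULT is a 32-bit value with a failure bit, a facility field and a 16-bit code. The Python sketch below splits the logged value into those parts, following the standard Windows HRESULT layout. The helper names are made up for illustration, and the sketch only decodes the number; it does not say why the class lookup failed on these servers.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into failure flag, facility and code."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: set means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


def decode_logged_error(hex_text):
    """Decode the hex string as it appears in the log, e.g. '80040154'."""
    return decode_hresult(int(hex_text, 16))


if __name__ == "__main__":
    parts = decode_logged_error("80040154")
    # Facility 4 is FACILITY_ITF (interface-specific COM errors).
    print(parts["failure"], parts["facility"], hex(parts["code"]))
```

For 80040154 this gives a failure with facility 4 (FACILITY_ITF) and code 0x154, which is the value the log itself names REGDB_E_CLASSNOTREG.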