Just the other day we moved a new BizTalk application into production. Shortly afterward, the running orchestration began failing with an error. I opened up Health and Activity Tracking (HAT) and found this error in the suspended orchestration and in the Application Log:
System.Data.OracleClient.OracleException: ORA-01017: invalid username/password; logon denied
   at System.Data.OracleClient.OracleException.Check(OciErrorHandle errorHandle, Int32 rc)
   at System.Data.OracleClient.OracleInternalConnection.OpenOnLocalTransaction(String userName, String password, String serverName, Boolean integratedSecurity, Boolean unicode, Boolean omitOracleConnectionName)
   at System.Data.OracleClient.OracleInternalConnection..ctor(OracleConnectionString connectionOptions)
   at System.Data.OracleClient.OracleConnectionFactory.CreateConnection(DbConnectionOptions options, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningObject)
   at System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnection owningConnection, DbConnectionPool pool, DbConnectionOptions options)
   at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject)
   etc.
Because HAT showed a send shape as the last thing to start executing, I immediately assumed there was a problem with the Oracle Send Port associated with that send shape. A teammate, also trying to solve the problem, assumed the same. We were both frustrated because this problem hadn’t happened in dev or test – what was different now? We tried many things to fix the production problem, but our efforts were in vain. The system ended up being rolled back and the old infrastructure put back in place – a true disaster.
A day later, after some of the pain and sorrow had subsided, I contacted Microsoft. I was put in contact with a great support person, who looked at the problem and started asking some questions. I was a little frustrated with his questions at first, but now I know why he was asking them. He kept asking about database connections we were opening in C# code, rather than about what seemed obvious to me: that this was related to the Oracle Adapter Static One-Way Send Port. He eventually explained why he wouldn’t accept the problem at face value: 1) the error didn’t indicate anything about the Oracle Adapter and 2) HAT is not always accurate! He mentioned that he has seen many instances where HAT does not show the exact node where the problem is occurring; furthermore, he has even seen HAT report incorrect debug values!
I couldn’t believe what I was hearing! Once he said this, I started to wonder… well, if the Oracle Adapter send port is not the problem (which would make sense since an earlier call to the same database via a solicit-response Oracle send port had worked), what could it be? I began examining the next node after the send, and found the node SiebelUpdate, which was trying to update an Oracle database via C# code. Immediately things started to click – right before deployment, I had been asked to use a new username/password for the Oracle Siebel connection. I had tested all of the previous credentials to avoid this kind of problem (there were 4 data sources being connected to in about 7 or 8 different ways), but I hadn’t tested the new one that had been given to me. I tried opening up the Siebel database using the credentials I had been given, and guess what, same error.
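In hindsight, a small pre-deployment check that tries every set of credentials, including any handed over at the last minute, would have caught this. Below is a minimal sketch of that idea in Python rather than C#. The names `check_connections`, `open_ok` and `open_siebel_with_new_credentials` are made up for illustration, and the openers are stand-ins that simulate connection attempts rather than calls to a real Oracle client.

```python
from typing import Callable, Dict, List, Tuple


def check_connections(openers: Dict[str, Callable[[], None]]) -> List[Tuple[str, str]]:
    """Try every opener and return (name, error text) for each one that fails."""
    failures = []
    for name, open_connection in openers.items():
        try:
            open_connection()
        except Exception as exc:
            # Record the failure and keep going so one bad login
            # does not hide problems with the remaining data sources.
            failures.append((name, str(exc)))
    return failures


# Hypothetical stand-ins: in a real check each function would open
# (and close) a connection using the exact credentials being deployed.
def open_ok() -> None:
    return None


def open_siebel_with_new_credentials() -> None:
    # Simulates the logon failure described in this post.
    raise RuntimeError("ORA-01017: invalid username/password; logon denied")


if __name__ == "__main__":
    results = check_connections({
        "existing data source": open_ok,
        "Siebel (new credentials)": open_siebel_with_new_credentials,
    })
    for name, error in results:
        print(f"{name}: {error}")
```

The point of collecting failures instead of stopping at the first one is that, with several data sources reached in several different ways, a single run reports every login that would break in production.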
So here’s what I learned:
1. Don’t trust HAT. If we had known this, I’m pretty sure we’d have figured this out prior to the rollback.
2. Look carefully at the error message – careful inspection showed that it wasn’t related to the Oracle Adapter.
3. Microsoft DOES use the ODBC connection for the Oracle Adapter (we had enabled logging but didn’t see anything – now I know why: the failing call never went through the adapter).
4. Don’t mistrust good old Oracle errors. I had, because there seemed to be no other explanation based on what was shown in HAT.
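To make the second lesson concrete, here is a rough sketch of reading the error message mechanically: split the flattened stack trace into frames and list which namespaces appear. It is written in Python purely for illustration; `split_frames` and `frame_prefixes` are hypothetical helper names, the regular expression is a simple heuristic rather than a full .NET stack trace parser, and the sample trace is a shortened copy of the one above.

```python
import re
from typing import List


def split_frames(flattened_trace: str) -> List[str]:
    """Return the fully qualified method name of each frame in a flattened trace."""
    # A frame looks like "at Namespace.Type.Method(args)". No word boundary is
    # required before "at" because flattened logs can glue it to the previous
    # word, as in "logon deniedat System.Data...".
    return re.findall(r"at\s+([A-Za-z_][\w.]*)\(", flattened_trace)


def frame_prefixes(frames: List[str], depth: int = 3) -> List[str]:
    """Return the distinct leading namespace components, sorted alphabetically."""
    return sorted({".".join(frame.split(".")[:depth]) for frame in frames})


# Shortened copy of the trace from this post.
TRACE = (
    "System.Data.OracleClient.OracleException: ORA-01017: invalid username/password; logon denied"
    "at System.Data.OracleClient.OracleException.Check(OciErrorHandle errorHandle, Int32 rc)"
    " at System.Data.OracleClient.OracleInternalConnection..ctor(OracleConnectionString connectionOptions)"
    " at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject)"
)

if __name__ == "__main__":
    for prefix in frame_prefixes(split_frames(TRACE)):
        print(prefix)
```

Every frame in the sample comes from `System.Data`, the .NET data provider that C# code calls directly, and nothing in it names the Oracle Adapter. That is the clue the support engineer picked up on.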



