Wednesday, June 10, 2009

Purge old files on Linux/Unix using the “find” command

I've noticed that one of our interface directories contains a lot of old files, some of them more than a year old. I checked with our implementers, and it turns out we can delete all files older than 60 days.

I decided to write a (tiny) shell script to purge all files older than 60 days and schedule it with crontab, so I won't have to deal with it manually. I wrote a find command to identify and delete those files, and started with the following:

find /interfaces/inbound -maxdepth 1 -mtime +60 -type f -exec rm {} \;

It finds and deletes all files in the directory /interfaces/inbound that are older than 60 days.
"-maxdepth 1" -> search the given directory only; don't descend into subdirectories.

After packing it in a shell script, I got a request to delete "csv" files only. No problem... I added a "-name" test to the find command:

find /interfaces/inbound -maxdepth 1 -name "*.csv" -mtime +60 -type f -exec rm {} \;

All csv files in /interfaces/inbound that are older than 60 days will be deleted.

But then the request changed, and I was asked to delete "*.xls" files in addition to the "*.csv" files. At this point things got complicated for me, since I'm not a shell script expert...

I tried several things, like adding another "-name" test to the find command:

find /interfaces/inbound -maxdepth 1 -name "*.csv" -name "*.xls" -mtime +60 -type f -exec rm {} \;

But no files were deleted. A few moments later I understood why: find ANDs its tests together by default, so I was asking for files that are both csv files and xls files (logically impossible, of course).

After struggling a little more with the find command, I managed to make it work by grouping the two "-name" tests with the -o (OR) operator:

find /interfaces/inbound -maxdepth 1 \( -name "*.csv" -o -name "*.xls" \) -mtime +60 -type f -exec rm {} \;

:-)
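
One more tip: before letting a scheduled script run "rm" unattended, it's worth doing a dry run - the same expression with -print instead of -exec lists exactly which files would be deleted:

# dry run - list the matching files without deleting anything
find /interfaces/inbound -maxdepth 1 \( -name "*.csv" -o -name "*.xls" \) -mtime +60 -type f -print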

Aviad

Wednesday, May 20, 2009

Upgrade Java plug-in (JRE) to the latest certified version

If you have already migrated Oracle EBS 11i to the Sun Java JRE plug-in, you may want to update it to the latest certified version from time to time. For example, say your EBS environment is configured to work with Java JRE 6 update 5 and you want to upgrade your clients to the latest JRE 6 update 13.

This upgrade process is very simple:

  1. Download the latest Java JRE installation file
    The latest update can be downloaded from the Sun Java SE download site.
    Download the "JRE 6 Update XX" under "Java SE Runtime Environment".
     
  2. Copy the above installation file to the appropriate directory:
    $> cp jre-6uXX-windows-i586-p.exe $COMMON_TOP/util/jinitiator/j2se160XX.exe

    Note that we have to rename the installation file to the following format: "j2se160XX.exe", where XX indicates the update version.
     
  3. Execute the upgrade script:
    $> cd $FND_TOP/bin
    $> ./txkSetPlugin.sh 160XX
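
For example, to upgrade to JRE 6 update 13 (as we did), the concrete commands would be:

    $> cp jre-6u13-windows-i586-p.exe $COMMON_TOP/util/jinitiator/j2se16013.exe
    $> cd $FND_TOP/bin
    $> ./txkSetPlugin.sh 16013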

That's all....

Since we upgraded our system to JRE 6 update 13 (two weeks ago), our users have stopped complaining about the mouse focus issues and forms freezes they experienced before. So... it was worth it...

If you haven't migrated from Jinitiator to the native Sun Java plug-in yet, it's highly recommended to do so soon, as Jinitiator is about to be desupported.

See the following post for detailed, step by step, migration instructions: Upgrade from Jinitiator 1.3 to Java Plugin 1.6.0.x.

You are welcome to leave a comment.

Aviad

Tuesday, March 17, 2009

Corruption in redo log file when implementing Physical Standby

Lately I started implementing Data Guard - Physical Standby - as a DRP environment for our production E-Business Suite database, and I must share with you one issue I encountered during the implementation.

I chose one of our test environments as the primary instance, and as the standby server I used a new machine that had been prepared for the production standby database. Both run Red Hat Enterprise Linux 4.

The implementation process went quickly with no special issues (at least, so I thought...): everything seemed to work fine, archived logs were transmitted from the primary server to the standby server and successfully applied on the standby database. I even executed a switchover to the standby server (both database and application tier), and switched back to the primary server with no problems.

The standby database was configured for maximum performance mode; I also created standby redo log files, and LGWR was set to asynchronous (ASYNC) network transmission.

The exact setting from the init.ora file:

log_archive_dest_2='SERVICE=[SERVICE_NAME] LGWR ASYNC=20480 OPTIONAL REOPEN=15 MAX_FAILURE=10 NET_TIMEOUT=30'

At this stage, with the major part of the implementation done, I found some time to deal with other tasks - interfaces to other systems, scripts, configuring rsync for concurrent log files, etc. - and to make some modifications to the setup document I had written during the implementation.

While working on those tasks, I left the physical standby instance active, so archived log files were transmitted to and applied on the standby instance. After a couple of hours I noticed the following errors in the primary database's alert log file:

ARC3: Log corruption near block 146465 change 8181238407160 time ?
Mon Mar  2 13:04:43 2009
Errors in file [ORACLE_HOME]/admin/[CONTEXT_NAME]/bdump/[sid]_arc3_16575.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 146465 change 8181238407160 time 02/03/2009 11:57:54
ORA-00312: online log 3 thread 1: '[logfile_dir]/redolog3.ora'
ARC3: All Archive destinations made inactive due to error 354
Mon Mar  2 13:04:44 2009
ARC3: Closing local archive destination LOG_ARCHIVE_DEST_1: '[archivelog_dir]/arch_[xxxxx].arc' (error 354)([SID])
Committing creation of archivelog '[archivelog_dir]/arch_[xxxxx].arc' (error 354)
ARCH: Archival stopped, error occurred. Will continue retrying
Mon Mar  2 13:04:45 2009
ORACLE Instance [SID] - Archival Error

I don't remember ever seeing a corruption in a redo log file before...
What was wrong?! Was it something with the physical standby instance?? Actually, if it were something with the standby instance, I would have expected a corruption in the standby redo log files, not the primary's...

The primary instance resides on a NetApp volume, so I checked the mount options in /etc/fstab, but they were fine. I asked our infrastructure team to check whether something had gone wrong with the network around the time I got the corruption, but they reported no errors or anything unusual.

OK, I had no choice but to reconstruct the physical standby database, since once an archived log file is missing, the standby database is out of sync. I set 'log_archive_dest_state_2' to defer so no further archived logs would be transferred to the standby server, cleared the corrupted redo log files (alter database clear unarchived logfile 'logfile.log'), and reconstructed the physical standby database.
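
For reference, this is roughly the sequence I used (a sketch - group 3 is the corrupted group reported in the alert log above, so adjust it to your case):

sqlplus "/ as sysdba" <<EOF
-- stop shipping redo to the standby
alter system set log_archive_dest_state_2=defer;
-- clear the corrupted online log group that was never archived
alter database clear unarchived logfile group 3;
EOF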

Meanwhile (copying the database files takes a long time...), I went through the documentation again - maybe I had missed something, maybe I had configured something wrong. I read a lot and didn't find anything that could shed some light on this issue.

At this stage, the standby was up and ready. First, I held up the redo transport service (log_archive_dest_state_2='defer') to see whether I'd get a corruption while the standby was off. After one or two days with no corruption, I activated the standby.

Then I saw the following sentence in Oracle® Data Guard Concepts and Administration 10g Release 2 (10.2):
"All members of a Data Guard configuration must run an Oracle image that is built for the same platform. For example, this means a Data Guard configuration with a primary database on a 32-bit Linux on Intel system can have a standby database that is configured on a 32-bit Linux on Intel system"

One moment, I thought to myself - the standby server is based on AMD processors and the primary server on Intel's... Could that be the problem?!
When talking about the same platform, does that also mean the same processors? Isn't it sufficient to have the same 32-bit OS on x86 machines?
Weird, but I had to check it...

Meanwhile, I got a corruption in a redo log file again, which confirmed that there was a real problem and it wasn't accidental.

So I used another AMD-based server (identical to the standby server) and started all over again - primary and standby instances. After two or three days with no corruption, I started to believe the difference in processors was the problem. But one day later I got a corruption again (oh no...)

I must say that on the one hand I was very frustrated, but on the other hand it was a relief to know it wasn't the difference in the processors.
It was quite clear that when I finally found the problem, it would turn out to be something stupid...

So it was not the processors, not the OS, and not the network. What else could it be?!

And here is where my familiarity with the "filesystemio_options" initialization parameter began (thanks to Oracle Support!). I don't know how I missed this note before, but it's all written here - Note 437005.1: Redo Log Corruption While Using Netapps Filesystem With Default Setting of Filesystemio_options Parameter.

When the redo log files are on a NetApp volume, "filesystemio_options" must be set to "directio" (or "setall"). When it is set to "none" (as on my instance before), reads and writes to the redo log files go through the OS buffer cache. Since NetApp storage is accessed over NFS (a stateless protocol), the consistency of asynchronous writes over the network is not guaranteed, and some writes can be lost. Setting "filesystemio_options" to "directio" makes writes bypass the OS cache layer, so no write is lost.
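
The fix itself is a single parameter change plus an instance restart, since "filesystemio_options" is a static parameter (a sketch):

sqlplus "/ as sysdba" <<EOF
-- static parameter: change it in the spfile, then restart the instance
alter system set filesystemio_options='DIRECTIO' scope=spfile;
shutdown immediate
startup
EOF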

Needless to say, once I set it to "directio", everything was fine and I haven't gotten any corruption since.

Aviad

Tuesday, March 10, 2009

JRE Plug-in “Next-Generation” – Part II

In my last post, "JRE Plug-in “Next-Generation” – to migrate or not?", I wrote about a Forms launching issue in EBS right after upgrading the JRE (Java Plug-in) to version 6 update 11, which runs on the new next-generation Java Plug-in architecture. The problem happens inconsistently, and Forms open reliably only when I disable the "next-generation Java Plug-in".

Following an SR I opened with Oracle support about this issue, I was asked to verify that the profile option "Self Service Personal Home Page Mode" is set to "Framework Only".

We had this profile option set to "Personal Home Page", as our users prefer that mode to the "Framework Only" one.

It's important to note that "Personal Home Page" is not a supported value for the "Self Service Personal Home Page Mode" profile option and may cause unexpected issues.

After setting the profile option to "Framework Only", the problem was resolved and the screen doesn't freeze anymore.
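
By the way, a quick way to check which value is actually resolved for this profile is a short SQL*Plus session (a sketch: the internal name APPLICATIONS_HOME_PAGE is my assumption for this profile, and [apps_password] is a placeholder):

sqlplus apps/[apps_password] <<EOF
-- resolves the profile for the current session's context (site level when run standalone)
select fnd_profile.value('APPLICATIONS_HOME_PAGE') home_page_mode from dual;
EOF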

So the solution in my case was to set the profile option "Self Service Personal Home Page Mode" to "Framework Only" (we are still testing it, but it looks fine so far). However, there are two more options that seem to work even when the profile option is set to "Personal Home Page" and the "next-generation Java Plug-in" is enabled.

1) Uncheck "Keep temporary files on my computer"
- Navigate to the Java control panel (Start -> Settings -> Control Panel -> Java, or Start -> Run -> javacpl.cpl)
- On the General tab -> Temporary Internet Files -> Settings -> uncheck "Keep temporary files on my computer".
- Check the issue from a fresh IE session.

I'm not sure how or why, but it solves the problem - no more freezing this way...

2) Set “splashScreen” to null
- Edit the $FORMS60_WEB_CONFIG_FILE file on your Forms server node.
- Change this line
"splashScreen=oracle/apps/media/splash.gif"
to
"splashScreen="

- No need to bounce any service.
- Check the issue from a fresh IE session.
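
If you prefer to make this change from the command line, a couple of shell lines do the same edit (a sketch: assumes GNU sed on the Forms node; keep a backup copy first):

# back up the Forms web config, then blank out the splashScreen parameter
cp $FORMS60_WEB_CONFIG_FILE $FORMS60_WEB_CONFIG_FILE.bak
sed -i 's|^splashScreen=.*|splashScreen=|' $FORMS60_WEB_CONFIG_FILE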

Again, it's not entirely clear how or why, but it solves the problem as well.

Now we just need to convince our users to accept the "Framework Only" look and feel, and then we will consider upgrading all our clients to the new next-generation Java Plug-in.

You are welcome to leave a comment or share your experience with the new Java Plug-in.

Aviad

Wednesday, February 18, 2009

JRE Plug-in “Next-Generation” – to migrate or not?

It has been more than half a year since we migrated from Oracle Jinitiator to the Sun Java JRE Plug-in (Java 6 update 5) in our Oracle Applications (EBS) system, and I must say, I'm not satisfied yet.

For the first few months we struggled with a lot of mouse focus bugs, which made our users very angry about this upgrade. Although we've applied some patches related to these bugs, we still have a few with no resolution.
Upgrading to Developer 6i patchset 19 solved some of the bugs, but not all of them.

As part of an SR we had opened about the mouse focus issues, we were advised by Oracle to install the latest Java JRE (Java 6 update 12 these days) as a possible solution for the remaining bugs.

Starting with Java 6 update 10, Sun introduced the new "next-generation Java Plug-in", which causes trouble with Oracle EBS. You can read more about this new architecture at the Sun Java site - "What is next-generation Java Plug-in".

Right after installing Java 6 update 11, I encountered a problem - when trying to open Forms, the screen freezes.


The browser window hangs inconsistently; I have no idea when it's going to open and when it's not. I tried Java 6 update 12 and it's the same - sometimes it opens and sometimes it doesn't. No matter what I did - clearing the Java cache on the client, clearing the Apache cache, installing the JRE in a different directory (in case a previous update of version 6 was installed), uninstalling previous versions of the Java Plug-in on the same PC, trying with Internet Explorer 6 and 7 - the problem wasn't resolved.

There is an unpublished open bug for this problem: Bug 7875493 - "Application freezes intermittently when using JRE 6U10 and later". I've been told by Oracle support that there are some incompatibilities with the new next-generation architecture and that they are working with Sun on it.

Meanwhile, there are two workarounds (the second didn't work for me, but it was suggested by Oracle support):

1) Disable the "next generation Java Plug-in" option:
Go to Control Panel -> Java -> Select the "Advanced" tab -> expand the "Java Plug-in" -> uncheck the "Enable the next-generation Java Plug-in" option.

This workaround always works (at least for me...).

2) Set the swap file to system managed + Tune the heap size for java:
- Go to Control Panel -> System -> Select the "Advanced" tab -> click on Settings (in Performance frame) -> Select the "Advanced" tab -> Click on Change -> Select the "System managed size" option.

- Go to Control Panel -> Java -> Select the "Java" tab -> Click "View..." (in Java Applet Runtime Settings frame) -> update the "Java Runtime Parameters" field with: "-Xmx128m -Xms64m".

This workaround doesn't work for me.

For now, I've decided to stay with the "old" Java Plug-in 6 update 5 and not upgrade our users to the new next-generation Java Plug-in. I hope the coming updates of the Java Plug-in will behave better, or that Oracle will publish a patch to solve this problem.

I'll post an update as soon as I have more info.

Aviad

Thursday, January 29, 2009

How to enable trace for a CRM session

I was asked to examine a performance issue in one of our CRM application screens, after some users complained about a specific action taking a long time.

The first thing I tried was to enable trace for the CRM session, but it turned out that it's definitely not simple to identify a CRM session - especially in my case, where one application session opens two (sometimes more) database sessions. It's practically impossible, actually.

So how is it possible to trace those CRM sessions anyway?

Oracle provides an option to execute custom code for every session opened in the database, through a system profile. The profile is called "Initialization SQL Statement - Custom" (its short name is 'FND_INIT_SQL') and it allows running custom SQL/PL*SQL code.

Once this profile is set at user level, each session opened for that user will first execute the code within the profile. No matter what type of activity the user performs - Forms, CRM, a concurrent request, or anything else that opens a database session - the content of this profile will be executed.

So clearly we can use this capability to enable trace for a user's sessions.

Steps to enable trace for a specific user:

  1. Login with “Application Developer” responsibility
  2. Open the “Create Profile” form –> Query the profile “FND_INIT_SQL”
  3. Make sure that “visible” and “updateable” are checked at user level.

  4. Switch responsibility to “System Administrator”
  5. Navigate to Profile –> System –> Query the profile “Initialization SQL Statement - Custom” at user level for the user we would like to enable trace for.

  6. Update the profile option value at user level to the following:

    BEGIN FND_CTL.FND_SESS_CTL('','', '', 'TRUE','','ALTER SESSION SET TRACEFILE_IDENTIFIER='||''''||'AVIADE' ||''''||' EVENTS ='||''''||' 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 '||''''); END;
       
    ** Just replace AVIADE with the user you are enabling trace for.

      
  7. Now, after the user you enabled trace for logs out of the application, they can log back in and reproduce the issue.
     
  8. When you finish reproducing the issue, disable the trace by clearing the profile option value and updating it to NULL (the “Initialization SQL Statement - Custom” profile, of course...).
  9. The trace file(s) will be waiting for you in your udump directory (the user_dump_dest init parameter); see the tip below.
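
Since the TRACEFILE_IDENTIFIER from step 6 is embedded in the trace file names, the files are easy to locate (a sketch, using the AVIADE identifier from the example above; adjust the path to wherever user_dump_dest points):

# list the tagged trace files in udump, newest first
cd $ORACLE_HOME/admin/[CONTEXT_NAME]/udump
ls -lt *AVIADE*.trc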

Since I enabled and disabled the trace quite a few times while investigating my performance issue, I wrote these simple, handy programs, which enable and disable trace for a user quickly and easily.

Execute this program to enable trace for a specific user (it replaces step 6 above):

DECLARE
  l_ret     boolean;
  l_user_id number;
BEGIN

  select user_id
    into l_user_id
    from fnd_user
   where user_name = '&&USER_NAME';

  l_ret := fnd_profile.SAVE(X_NAME        => 'FND_INIT_SQL',
                            X_VALUE       => 'BEGIN FND_CTL.FND_SESS_CTL('''','''','''', ''TRUE'','''',''ALTER SESSION SET TRACEFILE_IDENTIFIER=''||''''''''||''&&USER_NAME'' ||''''''''||'' EVENTS =''||''''''''||'' 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 ''||''''''''); END;',
                            X_LEVEL_NAME  => 'USER',
                            X_LEVEL_VALUE => l_user_id);
  commit;

  dbms_output.put_line('Profile has been updated successfully');

EXCEPTION
  when others then
    dbms_output.put_line('Failed to update the profile: '||sqlerrm);
END;


Execute this program to disable trace for a specific user (it replaces step 8 above):

DECLARE
  l_ret     boolean;
  l_user_id number;
BEGIN

  select user_id
    into l_user_id
    from fnd_user
   where user_name = '&USER_NAME';

  l_ret := fnd_profile.DELETE(X_NAME        => 'FND_INIT_SQL',
                              X_LEVEL_NAME  => 'USER',
                              X_LEVEL_VALUE => l_user_id);
  commit;

  dbms_output.put_line('Profile has been erased successfully');

EXCEPTION
  when others then
    dbms_output.put_line('Failed to erase the profile: '||sqlerrm);
END;

Hope you find it helpful…
Feel free to leave a comment or share your thoughts about this issue.

Aviad