Tuesday, December 9, 2008

ESX Madness

Scenario: One application server (Windows Server 2003, IIS) in a VM on ESX 3.5 Update 2. Works fine, connects to a DB on a physical machine over a gigabit link. Adding a second application server causes connections to the common database to fail under load testing, from both machines.

Tried: Looked at IRQ usage in /proc/vmware/interrupts. Interrupts seem to be distributed across all cores as expected. Earlier versions of ESX have issues with IRQ usage... see http://www.tuxyturvy.com/blog/index.php?/archives/37-Troubleshooting-VMware-ESX-network-performance.html Checking and optimizing code... only made the problem less frequent.

Problem isolated: The problem turned out to be linked to the NIC on the VM. The flexible adapter was failing under load for whatever reason. Sadly, this is the only adapter available for Windows Server 2003 Standard 32-bit in ESX 3.5.

The Fix:
  1. Shut down one of the VMs.

  2. Go into the virtual machine's settings and change the OS type to Server 2003 Enterprise 64-bit. Save and exit the settings window.

  3. Re-enter settings and add a network card to the VM. You will see that the e1000 is now available.

  4. After installing the e1000, change the OS type back to Server 2003 Standard 32-bit and start the VM.

  5. In Windows, go to Device Manager and select Properties for the new network adapter. Shut off all of the offloading for TCP checksumming, etc.

  6. Move your network settings to the new adapter, disable the old one in Windows, and disconnect it in the ESX VM settings.

In summation: Yes, it's a little like duct tape. But it worked for me. And if you are in a pinch and cannot afford the downtime to thoroughly examine a flailing production machine, this may work for you too. If it doesn't work for you, I am not responsible for data loss, downtime, etc... blah blah...

Wednesday, September 10, 2008

GoDaddy issues

Unfortunate customers of GoDaddy: If you have suddenly encountered abysmal performance with your server in the last few days (Apache accepts the connection quickly, but it takes 10+ seconds to begin receiving the web page), do this: Look in your web server configs to see if you have name resolution enabled for your web logs. If you do, turn it off and see if that helps. GoDaddy provisioned one of my servers with 208.109.96.1 for resolution, and that DNS server is now unreachable, so every logged request sits waiting on a lookup that times out. Call GoDaddy and get another DNS server to resolve against.
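
A quick way to look for this on an Apache box is sketched below. It assumes Apache's HostnameLookups directive is what turns on name resolution for your logs; the function name find_hostname_lookups and the config paths in the usage note are my own placeholders, so adjust them for your layout.

```shell
# Sketch only: list uncommented HostnameLookups directives that turn
# reverse DNS on (On or Double) in the Apache config files given as arguments.
# Returns 0 if at least one such line is found, non-zero otherwise.
find_hostname_lookups() {
    grep -HniE '^[[:space:]]*HostnameLookups[[:space:]]+(On|Double)' "$@"
}
```

Run it as something like find_hostname_lookups /etc/httpd/conf/httpd.conf /etc/httpd/conf.d/*.conf (paths assumed). If it prints anything, set those lines to Off and reload Apache.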

Friday, August 22, 2008

E911 and ANI/ALI tagging

Had a bunch of fun today with getting the correct number tagged onto our 911 calls. Note: Some telco providers override your caller ID number with the primary number for your PRI when you call 911. This is a problem if you have a central VoIP phone system that services two or more separate physical locations. Before 911 starts ignoring you altogether because of the incessant test calls, get your telco to allow you to present your own caller ID number to E911.

Friday, July 25, 2008

Advice on use of Nessus

FYI: Don't run Nessus against a machine hosted at GoDaddy. They drop traffic for a few minutes from any IP that appears to be port scanning. Aside from slowing down Nessus, this means any request for anything from the originating IP address (are you running it from behind your corporate firewall?) will be dropped. You may be able to mitigate this by changing scan settings, however...

Wednesday, July 23, 2008

Using Nagios to watch PRI usage

Here is a really quick (and even dirtier) way to determine if you are overextending your PRI usage.

This is a really simple script for Nagios to run and report how many concurrent calls you have.

Fill in the blanks for NAMEOFHOST (CallManager hostname), COMMUNITY_NAME (SNMP read-only community name for CallManager Express) and ifnums. ifnums are the integer values for IF-MIB::ifIndex.** where ** is the interface number for your PRI. You will have to do some snmpwalking to figure this out. I left some values in the script to show how to format your entries.
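
For the snmpwalking part, something like the sketch below can pull the index numbers out of net-snmp's snmpwalk output. The function name pri_ifindexes is made up, and the Serial0/0/0: pattern in the usage note is only an assumption about how the PRI channels are named on your router, so look at your own ifDescr output first.

```shell
# Sketch only: read `snmpwalk ... IF-MIB::ifDescr` output on stdin and print
# the ifIndex of every interface whose description matches the regex in $1.
pri_ifindexes() {
    awk -v pat="$1" '
        /^IF-MIB::ifDescr\./ {
            idx = $1                              # e.g. IF-MIB::ifDescr.23
            sub(/^IF-MIB::ifDescr\./, "", idx)    # keep only the index
            desc = $0
            sub(/^[^=]*= STRING: /, "", desc)     # keep only the description
            if (desc ~ pat) print idx
        }'
}
```

Usage would be along the lines of snmpwalk -v 2c -c COMMUNITY_NAME NAMEOFHOST IF-MIB::ifDescr | pri_ifindexes 'Serial0/0/0:' (SNMP version 2c is assumed here). The numbers it prints are what goes into ifnums.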

This script requires the check_ifoperstatus Nagios plugin to work. I will clean this up soon and re-post.

This script returns a CRITICAL status if the PRI usage is 7 or more concurrent calls.

    #!/bin/bash
    #
    # SNMP test for number of interfaces up in a particular range.
    # Uses bash rather than plain sh because ifnums is an array.
    #
    NAG_PLUG_PATH=/usr/lib/nagios/plugins
    IF_STAT_PLUG=check_ifoperstatus
    NAMEOFHOST=_____________
    COMMUNITY_NAME=____________
    # ifIndex values of the PRI interfaces to watch
    ifnums=( 23 24 25 26 27 28 29 30 31 32 )

    CTR1=0
    for interface in "${ifnums[@]}"
    do
        VAR1=$("$NAG_PLUG_PATH/$IF_STAT_PLUG" -k "$interface" -H "$NAMEOFHOST" -C "$COMMUNITY_NAME")
        # Anything not reported as CRITICAL (down) counts as a call in progress
        case "$VAR1" in
            CRITICAL*) ;;
            *) CTR1=$((CTR1 + 1)) ;;
        esac
    done

    # Nagios reads the exit code, not just the text: 0 = OK, 2 = CRITICAL
    if [ "$CTR1" -ge 7 ]; then
        STATUSCODE=CRITICAL
        EXITCODE=2
    else
        STATUSCODE=OK
        EXITCODE=0
    fi
    echo "PRI $STATUSCODE - $CTR1 calls currently"
    exit $EXITCODE

Liferay and Active Directory integration

This section was omitted from the recent Linux Pro Mag article on Liferay:

...

HOME STRETCH!!!

Now we have Liferay running in Tomcat, and Apache is handing off web requests for Liferay to Tomcat using mod_jk. The next phase is Active Directory authentication. We will assume that you already have a Windows domain controller to tie in to.

First, download and install Jxplorer from www.jxplorer.org. Start Jxplorer and enter the following information into the connection dialog:

    Host: 192.168.25.128
    Protocol: LDAPv3
    Base DN: DC=testdomain,DC=com
    Level: User + Password
    User DN: CN=liferay-access,CN=Users,DC=testdomain,DC=com
    Password: liferay-access

Replace the host IP address with your AD server's IP. Replace all instances of DC=testdomain,DC=com with your domain information. My Liferay user's name is liferay-access and the account resides within the Users container directly under testdomain.com in AD. Modify your User DN accordingly.

This was the most difficult part of the process for me when I first tried to get anything (non-Microsoft) to use an AD server for LDAP authentication. The MMC snap-ins water down the technology to the point that you initially don't have to understand LDAP object naming conventions to get up and running, or to manage a small domain. The complication comes in when you need to do any LDAP binding from a non-Microsoft platform. Once you are able to successfully bind to your AD server using the Liferay account's credentials, write down the information you used.

Next, we will back up the Liferay database in case we need to quickly restore our settings. SSH into the Liferay server and run:

    mysqldump lportal -u lportal -p >pre-ldap-dump.sql

Next, we will create the portal-ext.properties file to contain all of our LDAP settings. For the configuration options, I relied heavily on the Liferay user forums. If you run into issues, especially with LDAP, that is the first place to go looking for a solution.

The portal-ext.properties file is meant to override the settings in portal.properties. The portal.properties file resides under liferay-4.4.2/webapps/ROOT/WEB-INF/classes and contains the defaults for a ton of Liferay settings. Don't see the file? It only exists there in the event you built Liferay from source. Why? Who knows. You can find portal.properties in the source tree under portal-impl/classes. If you intend to tweak Liferay further, it would be a good idea to place the portal.properties file in the liferay-4.4.2/webapps/ROOT/WEB-INF/classes directory and copy the values you want to change to the portal-ext.properties file in the same directory.

    vi /opt/liferay/liferay-4.4.2/webapps/ROOT/WEB-INF/classes/portal-ext.properties

(See the portal-ext.properties listing at the end of this section.) Tweak the information there to coincide with your LDAP settings. Again, thanks to the Liferay user forums for the great explanation of these values.

Because the change we just made is to a Liferay configuration file, we must bounce the portal to see the results.

    service liferay restart

If all is well, you should be able to log in using the test@liferay.com credentials and use the Directory Portlet to see the users and groups imported from Active Directory.

Watch the catalina.out file for errors on startup with:

    tail -f /opt/liferay-4.4.2/logs/catalina.out

_____________portal-ext.properties___________

    ldap.factory.initial=com.sun.jndi.ldap.LdapCtxFactory
    ldap.base.provider.url=ldap://192.168.25.128:389
    ldap.base.dn=dc=testdomain,dc=com
    ldap.security.principal=liferay-access
    ldap.security.credentials=liferay-access
    ldap.auth.enabled=true
    ldap.auth.required=false
    ldap.auth.method=bind
    ldap.auth.password.encryption.algorithm=
    ldap.auth.password.encryption.algorithm.types=MD5,SHA
    #ldap.auth.search.filter=(cn=@screen_name@)
    ldap.auth.search.filter=(mail=@email_address@)
    ldap.user.mappings=screenName=sAMAccountName\npassword=userPassword\nemailAddress=mail\nfirstName=givenName\nlastName=sn\njobTitle=title\ngroup=memberOf\nfullName=cn
    #ldap.group.mappings=groupName=cn\ndescription=description\nuser=uniqueMember
    ldap.group.mappings=groupName=cn\ndescription=description\nuser=member
    ldap.import.enabled=true
    ldap.import.on.startup=true
    ldap.import.interval=10
    ldap.import.user.search.filter=(&(objectCategory=Person)(sAMAccountName=*))
    ldap.import.group.search.filter=(objectCategory=Group)
    ldap.import.method=user
    #ldap.import.method=group
    ldap.export.enabled=false
    ldap.password.policy.enabled=false
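
Since working out the DN strings was the fiddly part of the Jxplorer step, here is a small sketch of how the Base DN and User DN are put together from a DNS domain and an account name. Both function names are made up for illustration, and the second one assumes the account lives in the default Users container, as mine does.

```shell
# Sketch only: turn a DNS domain such as testdomain.com into an LDAP base DN.
domain_to_base_dn() {
    printf 'DC=%s\n' "$1" | sed 's/\./,DC=/g'
}

# Build the DN of an account that sits in the default Users container.
# $1 = account name (CN), $2 = DNS domain
users_container_dn() {
    printf 'CN=%s,CN=Users,%s\n' "$1" "$(domain_to_base_dn "$2")"
}
```

For example, users_container_dn liferay-access testdomain.com prints the User DN used in the connection dialog. If you happen to have the OpenLDAP client tools installed (an assumption, they were not part of the article), the same strings can be fed to ldapsearch -x -H ldap://192.168.25.128:389 -D "<user DN>" -W -b "<base DN>" to test the bind from the Liferay server itself.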

Google Dropping Emails?

I have had a bunch of trouble with script-generated emails timing out and not being sent through Google Apps. Not really sure why this is happening, but this is how I am fixing it... and it seems to work so far.

If you are using the article I wrote in Linux Pro Magazine a month or so ago to deploy Liferay, do not set up Liferay to connect to Google directly to send mail.

Instead:
  1. Set up Sendmail as a relay on your Liferay application server (only accessible to localhost) and use the mail settings in Liferay (ROOT.xml) to send all outbound email through localhost. No password should be necessary (depending on how you set Sendmail up), but make sure that the username you are using does, in fact, exist in your Google Apps account.

  2. In the Google Apps mail control panel, be sure to allow the public IP address your mail will be relayed from.

  3. Next, publish your SPF record if you have not already done so. See openspf.org for details. If your DNS provider does not offer TXT records for your domain, move to someone who does or host your own public authoritative DNS.

    SPF adoption is growing, and if you are not on board you will be left behind, with your communication being dropped into spam folders or rejected altogether. As of a few days ago, Network Solutions does not offer that service and has no ETA for when it will.

  4. Here is the hard part: if you consider these emails to be of critical importance you must contact Google (but only if you are a paying customer) and ask them nicely to remove spam filtering from the address you are using to send emails from Liferay.

Why does this work? Instead of having Liferay connect directly to Google's SMTP server (meant for clients like Outlook), the messages are sent through a local relay that does not talk to smtp.gmail.com. The relay looks up the MX records for your organization, which should point to Google's MXes (aspmx.l.google.com, etc.), and forwards the email there. When you whitelisted the IP address in the GApps control panel, you were telling Google that you would be intentionally relaying email from that address. The SPF record essentially does the same thing for other mail servers. If there is a timeout getting the email from the relay to Google, the relay should keep trying to resend.
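
For the SPF step, the TXT record value might look like what the sketch below prints. This assumes your mail only ever leaves through the relay's public IP and through Google's own servers; the function name is made up, 203.0.113.10 is a placeholder address, and ending with ~all (softfail) rather than -all is just my cautious choice for a first rollout.

```shell
# Sketch only: print an SPF TXT record value that authorizes one relay IP
# plus Google's mail servers.
# $1 = public IPv4 address of the relay
spf_record() {
    printf 'v=spf1 ip4:%s include:_spf.google.com ~all\n' "$1"
}
```

spf_record 203.0.113.10 gives you the string to paste into the TXT record for your domain. Once it is published, dig +short TXT yourdomain.com and dig +short MX yourdomain.com are a quick way to confirm that both the SPF record and the Google MX records are visible from the outside.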

This advice also applies to other technologies like Nagios that may generate a large quantity of critically important email.

Best of luck... -Ash

Wednesday, December 5, 2007

Tracking inbound call volume on Cisco CallManager Express using Splunk

*** DISCLAIMER *** I am not responsible for damage to your equipment or downtime incurred by your following my instructions. Research this and be sure of what you are doing before you do it. *******************

Here's a neat way to get a quick (and dirty) view into how many calls your company is taking in through Cisco CallManager Express using Splunk. Assuming you already have a syslog server set up and working...

  1. Install Splunk on the syslog server and verify that it is working.

  2. Set up the CallManager Express to send syslog entries to the syslog server (see http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ap04_:_syslog_Configuration_and_Cisco_Devices AND page 11-6 in http://www.cisco.com/univercd/cc/td/doc/product/voice/its/cmesrnd/managcme.pdf ).

  3. Telnet into the CME and do a show run. Look at your voice translation rules. The first number in each rule will appear in log entries as cdn:#### for inbound calls.

For example, if you have

    voice translation-rule 5
     rule 1 /5202/ /1007/
     rule 2 /5203/ /1006/
     rule 3 /5204/ /1007/
     rule 4 /5205/ /1007/
     rule 5 /5206/ /1007/
     rule 6 /5207/ /1009/
     rule 7 /5208/ /1006/
     rule 8 /5209/ /1007/
     rule 9 /5210/ /1008/
     rule 10 /5211/ /1007/

and someone calls in using 555-5204, the log entry should contain cdn:5204. You can use this to build some saved searches that will show your call traffic by DID by hour, minute, or whatever.

Cheers, -Ash
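
If you want to sanity-check the Splunk numbers from the command line, the same cdn:#### token can be counted straight out of the syslog file. This is only a sketch: the function name is made up, and it assumes nothing about the log line format beyond the cdn:#### token described above.

```shell
# Sketch only: read syslog lines on stdin and print "DID count" for every
# cdn:#### token seen, sorted by DID.
count_calls_by_did() {
    grep -oE 'cdn:[0-9]+' | cut -d: -f2 | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Feed it the file your syslog server writes the CME entries to, for example count_calls_by_did < /var/log/messages (path assumed). Narrowing to an hour is a matter of grepping for that timestamp prefix first.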

Tuesday, November 6, 2007

Liferay LDAP issues

Using Liferay 4.3.3 in Tomcat, authenticating against MS Active Directory. I was having a strange problem... only one user would not authenticate, and was throwing "...Problem accessing LDAP server: Unprocessed Continuation Reference..." All other accounts were importing and working perfectly. Using JXplorer and comparing accounts, I found that the account in question did not have an email address associated with it. Without an email address, Liferay will not authenticate... at least with the typical config, presumably because that config looks users up by their mail attribute.
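
To find other accounts that would hit the same wall, one option is to dump the user entries to LDIF and list the ones with no mail attribute. The sketch below assumes LDIF on stdin with a blank line between entries (the way ldapsearch prints it); the function name is made up, and folded long lines are not handled.

```shell
# Sketch only: read LDIF on stdin and print the sAMAccountName of every
# entry that has no mail attribute.
accounts_without_mail() {
    awk 'BEGIN { RS = ""; FS = "\n" }     # one LDIF entry per record, one line per field
    {
        name = ""; has_mail = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^sAMAccountName: /) { name = $i; sub(/^sAMAccountName: /, "", name) }
            if ($i ~ /^mail:/) has_mail = 1
        }
        if (name != "" && !has_mail) print name
    }'
}
```

Pipe an ldapsearch of your user container (or an LDIF export from JXplorer) into it, and every name it prints is an account Liferay will likely refuse until someone fills in the email address.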

Monday, August 13, 2007

ANI failure with E911 and Cisco Call Manager Express

Found out something really interesting... Typically ANI will only resolve if your outbound 911 calls have the billing phone number attached... so if your location has numerous DIDs, tag the outbound calls to 911 with the billing phone number. Also, if you have trouble faxing to some numbers (weird errors, etc...) it is likely that you are being blocked because of ANI funnies as well. I think some telcos block non-resolving numbers to prevent fax spam.

Wednesday, August 8, 2007

Cisco integration craziness

DISCLAIMER: I am NOT responsible for what you do with this information. If you take my advice and screw up, it's on YOU, bud. I'm also not responsible for what may happen to any warranty or service contract you have on your machine if you follow these instructions.

Setup: Cisco CallManager Express + Cisco Unity Connection

Problem: Corporate management wants to monitor all incoming calls for all inbound numbers using Unity Connection's Call Manager Traffic report... but they want the customer to reach a real human first, without hearing anything pre-recorded.

Caveat: If you've dealt with Unity Connection, you know that you can pass a call through a call handler with a blank greeting, but just before the transfer occurs the call handler tells the customer that their call is being transferred. According to Cisco there is no 'supported' way to disable that little voice prompt.

Solution: Every system voice prompt is stored on the Unity Connection server in WAV files in subdirectories under G:\Cisco Unity Connection\TuiResources\Prompts\ENU\G711\ or G:\Cisco Unity Connection\TuiResources\Prompts\ENU\G729\ So, I replaced the unwanted WAV file with a blank one and Voila! The customer gets passed through the call handler, gets tallied for reporting purposes and rings the sales phones directly.

Here's how to find the file:
  1. Search the G711 directory for files named prompt.ini containing the text that is spoken in the file you seek. Dig around in the results to find a reference to the offending file (may take a while).

  2. Copy the file to a network share if you are using Remote Desktop to access the machine... DO NOT PLAY IT IN A REMOTE DESKTOP SESSION!!! Playing audio in an RD session can ruin audio settings throughout the Unity Connection software package.

  3. Test the WAV to make sure it's what you want and rename the original to {original filename}.wav.OLD.

  4. Copy a blank WAV file into the same directory and name it so as to replace the original, unwanted file.

What I was looking for in particular was ...\G711\AvPHGreet\AvPHGreetENU005.wav. The file I replaced it with was originally named AvAddrSearchENU003.wav (under another folder). No server restart was required and the system now works beautifully!