When writing CGI scripts for web pages, the text presented to your script arrives URL-encoded: the browser translates special characters so they can be included in the URL, and the web server (such as Apache) passes the result to the script in the QUERY_STRING variable. Spaces are converted to “+”, variable assignments are separated with “&”, and most characters other than alphanumerics are converted to hex using % as a flag. For example: + is %2B, & is %26, and J can be written as %4A. Translating these back to their original strings can be a challenge, so here is a function that will take any string (typically $QUERY_STRING from a web page form) and translate it back to the original characters. You would call it with some inline code to extract each variable:
function ConvertHTML
{
    # Provide an HTML query string with %hex and +
    # characters and echo back original characters
    # translate + to space
    OLD="$(echo "$*" | tr "+" " ")"
    LEN=${#OLD}
    NEW=""
    # translate %## to a single char
    # Perl handles the hex to ASCII conversion
    CNT=1
    while [ $CNT -le $LEN ]
    do
        CHR="$(echo "$OLD" | cut -c $CNT)"
        if [ "$CHR" = "%" ]
        then
            # step past the % and pick up the two hex digits
            CNT=$(( CNT+1 ))
            HEX="$(echo "$OLD" | cut -c $CNT-$((CNT+1)))"
            CHRHEX="$(echo "$HEX" |
                perl -nle 'print join "", map { chr hex $_ } split " ";')"
            NEW="$NEW$CHRHEX"
            CNT=$(( CNT+1 ))
        else
            NEW="$NEW$CHR"
        fi
        CNT=$(( CNT+1 ))
    done
    echo "$NEW"
    return
}
## Process $QUERY_STRING (from a web form referral) into
## variable assignments
for PARM in $(echo "$QUERY_STRING" | tr "&" "\n" | tr ";" "\n")
do
    VAR="$(ConvertHTML "$PARM")"
    VARNAME="$(echo "$VAR" | cut -f 1 -d =)"
    VALUE="$(echo "$VAR" | cut -f 2- -d =)"
    # \$VALUE is expanded by eval itself, so quotes, $ and
    # backquotes in the value are assigned literally
    eval "$VARNAME=\$VALUE"
    HTMLSTRING=$(echo "$PARM" | cut -f 2- -d =)
    eval "${VARNAME}_HTML=\$HTMLSTRING"
done
For each parameter, the above loop assigns two variables: one with the name given in the QUERY_STRING, holding the translated value, and one with the same name plus “_HTML” appended, holding the original encoded string. That way, both the translated value and the original HTML string are available. For example, suppose an HTML document used SUBMIT to call a CGI script, and the QUERY_STRING looks like:
USERNAME=abc&COMPANY=%23company+name%23
The variable names are USERNAME and COMPANY. The & separates each variable assignment, + is for spaces and %23 is the # character. This is the result of the above code:
USERNAME = “abc”, USERNAME_HTML = “abc”
COMPANY = “#company name#”, COMPANY_HTML = “%23company+name%23”
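The example above can be reproduced with a self-contained sketch that needs neither Perl nor cut. This is not the function from this article: url_decode is a hypothetical name, the hex conversion is done with printf (hex to octal, then octal to character) in place of Perl, and the check that each variable name is a plain shell identifier before eval is an added precaution, assumed here because QUERY_STRING comes from the client.

```shell
#!/bin/sh
# Sketch only: url_decode is a hypothetical helper, not the article's ConvertHTML.

# Decode one URL-encoded string: "+" becomes a space, %XX becomes the byte XX.
url_decode() {
    in=$(printf '%s' "$1" | tr '+' ' ')
    out=
    while [ -n "$in" ]; do
        case $in in
            %[0-9A-Fa-f][0-9A-Fa-f]*)
                hex=${in#%}            # drop the leading %
                rest=${hex#??}         # everything after the two hex digits
                hex=${hex%"$rest"}     # keep only the two hex digits
                # hex -> octal -> character; the trailing x keeps a decoded
                # newline from being stripped by the command substitution
                chr=$(printf '%b' "\\0$(printf '%03o' "0x$hex")"; printf x)
                out=$out${chr%x}
                ;;
            *)
                rest=${in#?}           # everything after the first character
                out=$out${in%"$rest"}  # append that first character unchanged
                ;;
        esac
        in=$rest
    done
    printf '%s\n' "$out"
}

# Example input taken from the text above.
QUERY_STRING='USERNAME=abc&COMPANY=%23company+name%23'

# Split on & and ; using IFS; set -f prevents filename expansion.
old_ifs=$IFS
set -f
IFS='&;'
set -- $QUERY_STRING
IFS=$old_ifs
set +f

for parm in "$@"; do
    case $parm in
        *=*) ;;                # only handle name=value pairs
        *) continue ;;
    esac
    name=$(url_decode "${parm%%=*}")
    raw=${parm#*=}
    value=$(url_decode "$raw")
    # Assumed precaution: skip names that are not plain shell identifiers.
    case $name in
        ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) continue ;;
    esac
    eval "$name=\$value"
    eval "${name}_HTML=\$raw"
done

printf '%s = "%s", %s_HTML = "%s"\n' USERNAME "$USERNAME" USERNAME "$USERNAME_HTML"
printf '%s = "%s", %s_HTML = "%s"\n' COMPANY "$COMPANY" COMPANY "$COMPANY_HTML"
# prints:
# USERNAME = "abc", USERNAME_HTML = "abc"
# COMPANY = "#company name#", COMPANY_HTML = "%23company+name%23"
```

Because the sketch walks the string with parameter expansion rather than calling cut once per character, it starts far fewer processes on long query strings. It still spawns one subshell per %XX sequence.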