Troubleshooting SSL connection problems on F5

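1, Get the SSL debug log:
On the F5 CLI (tmsh), raise the SSL log level to Debug, then follow the LTM log from bash:
#modify /sys db log.ssl.level value Debug
#tail -f /var/log/ltm
On the client side (replace <vip>:<port> with the address of the virtual server):
#openssl s_client -connect <vip>:<port> -key client1.key -cert client1.crt
GET / HTTP/1.0

Check the log and find the error number. Set log.ssl.level back to Warning (the default) when you are done.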
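
When openssl is not available on the client, the same test can be approximated with Python's standard ssl module. This is only a minimal sketch: the function names are made up for illustration, the certificate and key paths are whatever you pass in, and server certificate validation is switched off on purpose, so it is meant for lab use only.

```python
import socket
import ssl


def make_context(certfile=None, keyfile=None, verify=False):
    """Build a client-side TLS context, optionally presenting a client certificate."""
    ctx = ssl.create_default_context()
    if not verify:
        # Lab use only: do not validate the server certificate.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    if certfile:
        # Same role as -cert / -key in the openssl s_client command.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx


def build_request(host):
    """Return a minimal HTTP/1.0 GET request with a Host header."""
    return f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def probe(host, port, certfile=None, keyfile=None, timeout=10.0):
    """Handshake, send one GET, return (TLS version, cipher name, first bytes of the reply)."""
    ctx = make_context(certfile, keyfile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host))
            return tls.version(), tls.cipher()[0], tls.recv(200)
```

Calling probe() with the virtual server address, client1.crt and client1.key while tailing /var/log/ltm should produce the same kind of log entries as the s_client test. A failed handshake raises ssl.SSLError on the client.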

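2, Add an iRule to the virtual server to get more information about client certificate verification:
when CLIENTSSL_CLIENTCERT {
    # Assumption: the original note did not name the event; CLIENTSSL_CLIENTCERT fires when the client certificate is received
    log local0.debug "nbr certs: [SSL::cert count] verifyResult: [SSL::verify_result] // [X509::verify_cert_error_string [SSL::verify_result]]"
    set i 0
    while {$i < [SSL::cert count]} {
        log local0.debug "[X509::subject [SSL::cert $i]]"
        incr i
    }
}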
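
The verifyResult number logged by the iRule can be looked up offline. The sketch below assumes that SSL::verify_result uses OpenSSL's X509_V_* numbering and lists only a handful of common codes; the function names are hypothetical, and the regular expression matches the message text produced by the iRule above, not the full syslog line.

```python
import re

# Subset of OpenSSL X509_V_* verification result codes (not exhaustive).
VERIFY_RESULTS = {
    0: "ok",
    2: "unable to get issuer certificate",
    9: "certificate is not yet valid",
    10: "certificate has expired",
    18: "self-signed certificate",
    19: "self-signed certificate in certificate chain",
    20: "unable to get local issuer certificate",
    21: "unable to verify the first certificate",
    23: "certificate revoked",
}


def extract_verify_result(message):
    """Return the numeric verifyResult from an iRule log message, or None if absent."""
    match = re.search(r"verifyResult:\s*(\d+)", message)
    return int(match.group(1)) if match else None


def describe_verify_result(code):
    """Translate a verify result code into a short description."""
    return VERIFY_RESULTS.get(int(code), f"unknown verify result {code}")
```

For example, a message containing "verifyResult: 20" usually points at a missing intermediate or CA certificate in the trusted bundle of the client SSL profile.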

3, Run ssldump on the troublesome SSL session
a) Use find to locate the key file:
[user@xxx:/S1-green-P:Active:In Sync] / # find / -iname "*.key*"
b) ssldump -A -d -k <keyfile> -n -i <vlan> <filter>
-A Print all fields
-d Show application data when private key is provided via -k
-k Private key file, found in /config/ssl/ssl.key/; the key in use can be identified from the client SSL profile
-n Do not try to resolve PTR records for IP addresses
-i Capture interface; use the name of the ingress VLAN for the TLS traffic
For example:
[user@xxx:/S1-green-P:Active:Changes Pending] / # ssldump -A -d -k ./config/filestore/files_d/partition_d/certificate_key_d/:partition:1410ws.verifiering.hsa.sjunet.org.key_98094_1 -i 0.0 host <ip> and port 443

The key file does not need to be specified when we only want to check the SSL handshake and not the application data:
[user@xxx:/S1-green-P:Active:Changes Pending] / # ssldump -ni 0.0 host <ip> and port 443
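
The two ssldump variants differ only in whether the key options are present, so the command line can be assembled by a small helper. This is a sketch using only the flags listed above; the function name and the example key path are hypothetical.

```python
import shlex


def ssldump_command(interface, bpf_filter, keyfile=None):
    """Build an ssldump command line; add -d and -k only when a key file is given."""
    args = ["ssldump", "-A"]
    if keyfile:
        # Application data can only be shown when the private key is supplied.
        args += ["-d", "-k", keyfile]
    args += ["-n", "-i", interface]
    # The capture filter is passed as separate words, as on the shell.
    args += bpf_filter.split()
    return shlex.join(args)
```

Calling ssldump_command("0.0", "port 443") gives the handshake-only form, and passing keyfile= adds the decryption options.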


F5 vCMP upgrade summary

Software upgrade process:
1, Sync the HA pair, then reactivate the license on each node (the license must be reactivated on the vCMP host when upgrading either the vCMP host or a vCMP guest)
2, Back up and save a UCS for each node
3, Upload and install the new image (the image needs to be uploaded to the vCMP host only, because vCMP guests have access to the images on the host, but the installation must be done on each node)
4, Reboot
5, Check the HA status. It may show “disconnected” because the software versions on the peer nodes do not match, but failover should still work.
6, Fail over
7, Upgrade the previously active node and reboot it
8, Check the HA status; it should be connected and require a sync
9, Sync

Note: in a vCMP setup you may upgrade either the vCMP host or the vCMP guest first; the order does not really matter. Check the software support matrix for compatible vCMP host and guest versions.
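
The order of the steps for an HA pair can be written down as a small checklist generator. This is only a sketch of the sequence above, under the assumption that the standby node is upgraded first; the function name and node names are hypothetical and nothing here talks to a BIG-IP.

```python
def upgrade_plan(active, standby):
    """Return the ordered upgrade steps for an HA pair as plain strings."""
    plan = ["sync the HA pair"]
    for node in (standby, active):
        plan.append(f"{node}: reactivate license")
        plan.append(f"{node}: save UCS backup")
    plan += [
        f"{standby}: install new image and reboot",
        "check HA status (may show disconnected during the version mismatch)",
        f"fail over from {active} to {standby}",
        f"{active}: install new image and reboot",
        "check HA status (expect connected, sync required)",
        "sync the HA pair",
    ]
    return plan
```

Printing upgrade_plan("lb01", "lb02") line by line gives a checklist that can be ticked off during the maintenance window.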

Failover of F5 LTM


1, Normally we use an HA group (fast failover), because failover triggered by VLAN fail-safe or gateway fail-safe takes about 10 seconds, while HA group failover happens almost immediately.

2, We are running version 11.6, and I have found that the failover method of the traffic group must be changed to HA Group for HA group failover to work.
You can check the HA score with the command show /sys ha-group

With HA Order configured as the failover method, the output looks like this:
LB(Active)(/Common)(tmos)# show /sys ha-group detail

Sys::HA Group: lb01-ha
State enabled
Active Bonus 10
Score 0

| Sys::HA Group Trunk: nko-lb01-ha:lb-trunk
| Threshold 1
| Percent Up 100
| Weight 20

The HA group score is always 0, and no failover happens even if you shut down the trunk. After you change the failover method to HA Group, the output looks like this:
LB(Active)(/Common)(tmos)# show /sys ha-group

Sys::HA Group: lb01-ha
State enabled
Active Bonus 10
Score 20

| Sys::HA Group Trunk: nko-lb01-ha:lb-trunk
| Threshold 1
| Percent Up 100
| Weight 20
| Score Contribution 20
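
The Score Contribution line can be reproduced by hand. The sketch below assumes, as I understand the documented behaviour, that an object contributes its weight multiplied by the fraction of available members, and contributes 0 when fewer members than the threshold are up. The active bonus is not modelled, and the function name is hypothetical. Verify against your own version before relying on it.

```python
def trunk_contribution(weight, members_up, members_total, threshold=0):
    """Score contribution of one trunk in an HA group (simplified model)."""
    if members_total <= 0 or members_up < threshold:
        # Below the threshold the object contributes nothing.
        return 0
    # Weight scaled by the share of trunk members that are up.
    return weight * members_up // members_total
```

With weight 20 and all members up this yields 20, which matches the output above; shutting down the whole trunk drops the contribution to 0.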

3, HA failover unicast configuration
You always need to configure two IPs for failover to work: the MGMT IP and the failover IP. The failover IP sits on a dedicated failover link between the LTM nodes.
Removing the MGMT IP causes both LTM nodes to switch to the active state, even if the failover IP is configured and reachable. Removing the failover IP causes the same problem, even if the MGMT IP is configured and reachable.

The sync and mirroring addresses can be set to the failover IP only; the MGMT IP is not needed there.

4, What will trigger failover?

Refer to the link given under item 5 below:

The BIG-IP system initiates failover according to any of several events that you define. These events fall into these categories:

System fail-safe
With system fail-safe, the BIG-IP system monitors various hardware components, as well as the heartbeat of various system services. You can configure the system to initiate failover whenever it detects a heartbeat failure.
Gateway fail-safe
With gateway fail-safe, the BIG-IP system monitors traffic between an active BIG-IP system in a device group and a pool containing a gateway router. You can configure the system to initiate failover whenever some number of gateway routers in a pool of routers becomes unreachable.
VLAN fail-safe
With VLAN fail-safe, the BIG-IP system monitors network traffic going through a specified VLAN. You can configure the system to initiate failover whenever the system detects a loss of traffic on the VLAN and the fail-safe timeout period has elapsed.
HA groups
With an HA group, the BIG-IP system monitors trunk, pool, or cluster health to create an HA health score for a device. You can configure the system to initiate failover whenever the health score falls below configurable levels.
When you enable auto-failback, a traffic group that has failed over to another device fails back to a preferred device when that device is available. If you do not enable auto-failback for a traffic group, and the traffic group fails over to another device, the traffic group remains active on that device until that device becomes unavailable.

5, Failover methods:

Refer to https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/bigip-device-service-clustering-admin-11-5-0/8.html

  • Select Load Aware when the device group contains heterogeneous platforms and you want to ensure that a traffic group fails over to the device with the most capacity at the moment that failover occurs.
  • Select HA Order to cause the traffic group to fail over to the first available device in the Failover Order list.
  • Select HA Group to cause the BIG-IP system to trigger failover based on an HA health score for the device.
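
The HA Order method amounts to picking the first available device from an ordered list. A minimal sketch of that selection follows; the function and device names are hypothetical, and the real system of course tracks availability itself.

```python
def next_device_ha_order(failover_order, available, current=None):
    """Return the first available device in the failover order, skipping the current one."""
    for device in failover_order:
        if device != current and device in available:
            return device
    # No other device can take over the traffic group.
    return None
```

For example, with the order lb01, lb02, lb03 and lb02 unavailable, a traffic group leaving lb01 moves to lb03.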

F5 reset troubleshooting

The following are the most common reasons why clients receive a reset from F5:

1, After 5 retransmissions without a response the connection times out and F5 sends a reset

2, If F5 does not support any of the SSL versions or ciphers the client wants to use, F5 immediately responds with a TCP RST.

3, SSL handshake timeout (10 seconds by default)

4, Application-caused reset. The simplest case is when you close the socket and then write more data to the output stream. By closing the socket, you told your peer that you are done talking, and it can forget about your connection. When you send more data on that stream anyway, the peer rejects it with an RST to let you know it isn’t listening.

5, One-arm scenario: the VIP needs SNAT configured if the back-end server's default gateway bypasses F5. Otherwise the F5 connection towards the back-end server times out, after which F5 sends a reset to the client side.

6, Following on from item 5: if automap is configured, the source is translated to the self IP on the egress VLAN towards the servers. If F5 has no self IP configured on that VLAN, F5 sends a reset.

7, The Server SSL profile Secure Renegotiate setting is set to Require or Require Strict, and the back-end SSL server lacks support for the Transport Layer Security (TLS) Renegotiation Indication Extension

8, HTTP header size exceeded by server

9, HTTP header size exceeded by client

10, When an existing client-side connection has been detached from the server-side connection and reselects a new server, the BIG-IP system sends a TCP RST to the server to close the existing server-side connection. This behavior typically comes from using iRule commands such as LB::reselect.

11, No route to host

12, The BIG-IP system receives a SYN for either of the following:

  • A virtual server of type reject
  • A port that is protected by the Port Lockdown settings on a self IP address