Archive for the Systems Category

An issue with php70-php-fpm where PCRE (Perl Compatible Regular Expressions) causes occasional segfaults.

Current pcre/jit details from phpinfo:

# php -r "phpinfo();" |egrep -i "pcre|jit"
auto_globals_jit => On => On
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 8.32 2012-11-30
PCRE JIT Support => enabled
pcre.backtrack_limit => 1000000 => 1000000
pcre.jit => 0 => 0
pcre.recursion_limit => 100000 => 100000

Here’s the relevant php.ini section (/etc/opt/remi/php70/php.ini):

[Pcre]
;PCRE library backtracking limit.
; http://php.net/pcre.backtrack-limit
;pcre.backtrack_limit=100000
pcre.backtrack_limit=2000000
; Increased from 1,000,000 (default) to 2,000,000

;PCRE library recursion limit.
;Please note that if you set this value to a high number you may consume all
;the available process stack and eventually crash PHP (due to reaching the
;stack size limit imposed by the Operating System).
; http://php.net/pcre.recursion-limit
;pcre.recursion_limit=100000
pcre.recursion_limit=1000000
; Increased from 100,000 to 1,000,000

;Enables or disables JIT compilation of patterns. This requires the PCRE
;library to be compiled with JIT support.
;pcre.jit=0
pcre.jit=1
; STK 051717 Enabled pcre.jit.  It's compiled in and looks like it should help perf.
; php -r "phpinfo();" |egrep -i "pcre|jit"

After making those changes, restart php70-php-fpm:
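
# systemctl restart php70-php-fpm

Then verify the new values are live: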

# php -r "phpinfo();" |egrep -i "pcre|jit"
auto_globals_jit => On => On
pcre
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 8.32 2012-11-30
PCRE JIT Support => enabled
pcre.backtrack_limit => 2000000 => 2000000
pcre.jit => 1 => 1
pcre.recursion_limit => 1000000 => 1000000

Still need to find out what is causing PCRE to overflow the stack in the first place.

EDIT: Need to double-check SELinux

If you want to allow httpd to execmem, then you must tell SELinux about this by enabling the ‘httpd_execmem’ boolean.

# setsebool -P httpd_execmem 1
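
To confirm the boolean is now set:

# getsebool httpd_execmem
httpd_execmem --> on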

Alternatively, you can generate a local policy module to allow this access by executing:

# ausearch -c 'php-fpm' --raw | audit2allow -M my-phpfpm
# semodule -i my-phpfpm.pp
Additional Information:
Source Context system_u:system_r:httpd_t:s0
Target Context system_u:system_r:httpd_t:s0
Source php-fpm

Raw Audit Messages:
type=AVC msg=audit(1495071733.385:311998): avc: denied { execmem } for pid=989 comm="php-fpm" scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:system_r:httpd_t:s0 tclass=process

type=SYSCALL msg=audit(1495071733.385:311998): arch=x86_64 syscall=mmap success=no exit=EACCES a0=0 a1=10000 a2=7 a3=22 items=0 ppid=984 pid=989 auid=4294967295 uid=99 gid=99 euid=99 suid=99 fsuid=99 egid=99 sgid=99 fsgid=99 tty=(none) ses=4294967295 comm=php-fpm exe=/opt/remi/php70/root/usr/sbin/php-fpm subj=system_u:system_r:httpd_t:s0 key=(null)

Hash: php-fpm,httpd_t,httpd_t,process,execmem


L4 or L7 for Load balancing?

For web services, go the L7 route and send all traffic through the LB (set it as the default route on the servers) to get the full set of benefits. For DNS, I chose the L4 route. Each comes with its own configuration, covered below.

Testing the KEMP LoadMaster in a config similar to this:

Internet --> Firewalls --> KEMP LM (1.252) --> VIP (1.250) on SVR(s)

Make sure your services are configured to bind to the VIP.

LoadMaster Sample Config – Layer 7
Service Type: HTTP/HTTPS (example)
Force L7: Yes
Transparency: Yes

You will NOT be able to connect to the VIP from the same subnet as the servers, so keep that in mind if you are testing the VIP and are monitoring availability internally on the same subnet.

Use Direct Server Return (DSR) to Take a Load Off Your Load Balancer

With Direct Server Return (DSR), you have to run at L4, and the return traffic does not pass through the VLM for inspection.  The advantage is off-loading that traffic from the load balancer entirely; the trade-off is that you miss out on the built-in L7 services that are the KEMP LoadMaster’s core strength.

Direct server return (DSR) is a load balancing scheme in which service requests come in via the load balancer virtual IP (VIP), but the back-end servers send their responses directly to the client, bypassing the load balancer entirely on the return path. You may want to do this if you serve larger files or traffic that doesn’t need to be transformed at all on its way back to the client.

Here’s how it works: incoming requests arrive at the VIP, which lives on the load balancer itself. The load balancer then passes each request to the appropriate back-end server, modifying only the destination MAC address.

DSR workflow

You need to be aware of the following when using DSR:

  • Address resolution protocol (ARP) requests for the VIP must be ignored by the back-end servers if the load balancer and back-end servers are on the same subnet. If not, the VIP traffic routing will be bypassed as the back-end server establishes a direct connection with the client.
  • The servers handling the DSR requests must respond to heartbeat requests with their own IP and must respond to requests for content with the load balancer VIP.
  • Application acceleration is not a possibility because the load balancer does not handle the responses from the backend servers.

Here are the configuration steps for Linux (our VIP is 192.168.1.200 and our Physical Server IP is 192.168.1.88):

CentOS Server Configuration

Disable invalid ARP replies by adding the following to /etc/sysctl.conf.  Note: the example below uses interface ens32; if yours is eth0 (or anything else), change the lines accordingly:

net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.ens32.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.ens32.arp_announce = 2

Now, reload sysctl values:

# sysctl -p
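
To spot-check the live values:

# sysctl net.ipv4.conf.all.arp_ignore net.ipv4.conf.ens32.arp_ignore net.ipv4.conf.all.arp_announce net.ipv4.conf.ens32.arp_announce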

If you are using Layer 7 (rather than DSR at Layer 4), point the default route at the load balancer (1.252) instead of your normal gateway: change the gateway address in the interface config, then restart the network.

# vi /etc/sysconfig/network-scripts/ifcfg-ens32
# systemctl restart network
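
The relevant change in ifcfg-ens32 is the gateway line (assuming the LoadMaster sits at 192.168.1.252, per the diagram above):

GATEWAY=192.168.1.252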

Next, add a loopback alias for the service VIP, now that ARP responses for it are disabled.

Create an additional loopback interface with an IP alias for the load balancer VIP (192.168.1.200) using the ifconfig command:
# ifconfig lo:0 192.168.1.200 netmask 255.255.255.255

Enter the following command to verify the configuration:

# ifconfig lo:0
lo:0: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 192.168.1.200 netmask 255.255.255.255
loop txqueuelen 0 (Local Loopback)

Note that if the machine reboots, this configuration will not be persistent. To set it permanently, some configuration files need to be edited; the steps vary from distribution to distribution, but typically you would define the alias in an /etc/sysconfig/network-scripts/ifcfg-lo:X file.
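
For example, on CentOS/RHEL a minimal /etc/sysconfig/network-scripts/ifcfg-lo:0 sketch (using our example VIP) would be:

DEVICE=lo:0
IPADDR=192.168.1.200
NETMASK=255.255.255.255
ONBOOT=yes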

Direct Routing and the ARP Limitation

While there are many advantages to using direct routing, there are limitations as well. The most common issue with direct routing is with Address Resolution Protocol (ARP).

In typical situations, a client on the Internet sends a request to an IP address. Network routers deliver requests to their destination by relating IP addresses to a machine’s MAC address with ARP. ARP requests are broadcast to all connected machines on a network, and the machine with the correct IP/MAC address combination receives the packet. The IP/MAC associations are stored in an ARP cache, which is cleared periodically (usually every 15 minutes) and refilled with fresh IP/MAC associations.

The issue with ARP requests in a direct routing setup is that a client request to an IP address must be associated with a MAC address for the request to be handled, so the virtual IP address of the load balancer must be associated with a MAC as well. However, since both the LVS router and the real servers share the same VIP, the ARP request is seen by all of the machines associated with it. This can cause several problems, such as the VIP being associated directly with one of the real servers, which then processes requests directly, bypassing the LVS router completely and defeating the purpose of the setup.

To solve this issue, ensure that incoming requests are always sent to the LVS router rather than one of the real servers. This can be done by either filtering ARP requests or filtering IP packets. ARP filtering can be done using the arptables utility, and IP packets can be filtered using iptables or firewalld. The two approaches differ as follows:
  • The ARP filtering method blocks ARP requests from reaching the real servers. This prevents ARP from associating VIPs with real servers, leaving the active load balancer as the only node to respond with its MAC address.
  • The IP packet filtering method permits routing packets to real servers with other IP addresses. This completely sidesteps the ARP problem by not configuring VIPs on real servers in the first place.

Direct Routing Using arptables

In order to configure direct routing using arptables, each real server must have the virtual IP address configured so it can directly serve packets. ARP requests for the VIP are ignored entirely by the real servers, and any ARP packets that might otherwise be sent containing the VIP are mangled to contain the real server’s IP instead of the VIP.

Using the arptables method, applications may bind to each individual VIP or port that the real server is servicing. For example, the arptables method allows multiple instances of Apache HTTP Server to be running, bound explicitly to different VIPs on the system.

However, using the arptables method, VIPs cannot be configured to start on boot using the standard Red Hat Enterprise Linux system configuration tools.

To configure each real server to ignore ARP requests for each virtual IP address, perform the following steps:
  1. Create the ARP table entries for each virtual IP address on each real server (the real_ip is the IP the director uses to communicate with the real server; often this is the IP bound to ens32):
    arptables -A INPUT -d <virtual_ip> -j DROP
    arptables -A OUTPUT -s <virtual_ip> -j mangle --mangle-ip-s <real_ip>
    
    # arptables -A INPUT -d 192.168.1.200 -j DROP
    # arptables -A OUTPUT -s 192.168.1.200 -j mangle --mangle-ip-s 192.168.1.88

    To list all entries:

    # arptables --list -n
    Chain INPUT (policy ACCEPT)
    -j DROP -d 192.168.1.200 
    
    Chain OUTPUT (policy ACCEPT)
    -j mangle -s 192.168.1.200 --mangle-ip-s 192.168.1.88
    
    Chain FORWARD (policy ACCEPT)

    This will cause the real servers to ignore all ARP requests for the virtual IP addresses, and change any outgoing ARP responses which might otherwise contain the virtual IP so that they contain the real IP of the server instead. The only node that should respond to ARP requests for any of the VIPs is the current active Load Balancer.

  2. Once this has been completed on each real server, save the ARP table entries by typing the following commands on each real server:
    # arptables-save > /etc/sysconfig/arptables
    # systemctl enable arptables.service

    The systemctl enable command will cause the system to reload the arptables configuration on bootup — before the network is started.

  3. Configure the virtual IP address on all real servers using ip addr to create an IP alias, using our example VIP and interface:

    # ip addr add 192.168.1.200 dev ens32

  4. Configure Keepalived for Direct Routing. This can be done by adding lb_kind DR to the keepalived.conf file.
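
    As a rough sketch (using our example VIP 192.168.1.200 and real server 192.168.1.88; check your keepalived version’s syntax), the relevant keepalived.conf block would be:

    virtual_server 192.168.1.200 80 {
        delay_loop 10
        lb_algo rr
        lb_kind DR
        protocol TCP

        real_server 192.168.1.88 80 {
            TCP_CHECK {
                connect_timeout 3
            }
        }
    }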

Direct Routing Using firewalld

You can avoid the ARP issue with the direct routing method by creating firewall rules using firewalld. To configure direct routing, add rules that create a transparent proxy so that a real server will service packets sent to the VIP address, even though the VIP address does not exist on the system.

The firewalld method is simpler to configure than the arptables method. It also circumvents the ARP issue entirely, because the virtual IP address(es) only exist on the active load balancer.

However, there are performance issues using this method compared to arptables, as there is overhead in forwarding, with IP masquerading, every return packet.

You also cannot reuse ports using this method. For example, it is not possible to run two separate Apache HTTP Server services bound to port 80, because both must bind to INADDR_ANY instead of the virtual IP addresses.

To configure direct routing for IPv4 using firewalld, perform the following steps on every real server:
  1. Enable IP masquerading for the zone of the network interface that receives the packets from the LVS. For example, for the external zone, as root:

    # firewall-cmd --zone=external --add-masquerade --permanent

    If zone is omitted, the default zone is used. The --permanent option makes the setting persistent, but the command will only take effect at next system start. If required to make the setting take effect immediately, repeat the command omitting the --permanent option.

  2. Enter commands in the following format for every VIP, port, and protocol (TCP or UDP) combination intended to be serviced by the real server:
    firewall-cmd --zone=zone --add-forward-port=port=port_number:proto=protocol:toport=port_number:toaddr=virtual_IP_address

    For example, to configure TCP traffic on port 80 to be redirected to port 3753 at 192.168.10.10:

    # firewall-cmd --zone=external --add-forward-port=port=80:proto=tcp:toport=3753:toaddr=192.168.10.10 --permanent

    As above, the --permanent option makes the setting persistent; repeat the command without it if you need the change to take effect immediately.

    This command will cause the real servers to process packets destined for the VIP and port that they are given.
  3. If required, ensure firewalld is running:

    # systemctl start firewalld

    To ensure firewalld is enabled to start at system start:

    # systemctl enable firewalld

See also Red Hat Load Balancer Administration

System Architecture for Scaling a Virtual Environment

SSL certificates are installed for each domain at the Nginx level; Nginx is configured as an SSL-terminating proxy only.  Certs are provided by Let’s Encrypt unless a domain needs something else.

(non-secure) Varnish (80) –> Apache (8080) –> Redis –> MariaDB

(secure) Nginx (as ssl proxy) (443) –> Varnish (80) –> Apache (8080) –> Redis –> MariaDB
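
As a rough sketch, the Nginx SSL-proxy piece of that chain looks like this (server name and cert paths are placeholders; assumes Varnish listens on 127.0.0.1:80):

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # Tells the backend (and WordPress, below) the original request was HTTPS
        proxy_set_header X-Forwarded-Proto https;
    }
}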


Add Link here to Varnish CMS / WordPress, Joomla, Drupal config

Apache / PHP-FPM –> Redis –> MariaDB


WordPress Special Notes:

You have to tell WordPress that it is behind an SSL-terminating proxy so it functions properly. To accomplish this, I use the following code in wp-config.php:

// Trust the proxy's X-Forwarded-Proto header so WordPress knows the request was HTTPS
if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] === 'https') {
    $_SERVER['HTTPS'] = 'on';
}
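
While you are in wp-config.php, you can also force SSL for the admin (the quoted note further down mentions this as well):

define('FORCE_SSL_ADMIN', true);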

Be sure to restart everything once you make your change:

# systemctl restart varnish; systemctl restart nginx; systemctl restart php70-php-fpm; systemctl restart httpd 

You may find yourself needing a WordPress plugin to clean up remaining protocol issues, such as hard-coded http:// URLs.

Here are a couple to try:

  • https://mattgadient.com/remove-protocol/
  • https://wordpress.org/plugins/remove-http/

Misc notes, saved for possible Varnish VCL changes:

> We run Varnish in between an F5 and Apache as well as use Nginx for ssl and load
> balancing in development, in conjunction with WordPress backends. You have to
> tell WordPress that you are behind SSL and it will function properly. To
> accomplish this I’d use the following code in wp-config.php
> 
> if ($_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https') {
>        $_SERVER['HTTPS']='on';
> }
> 
> You can then also set FORCE_SSL_ADMIN and FORCE_SSL_LOGIN however you see fit
> and it should work. I saw some updates not that long ago to support proxy
> headers but don’t believe they are fully supported yet.
> 
> Jason
> 
> 
>> On Nov 2, 2015, at 12:37 PM, Carlos M. Fernández <cfernand at sju.edu> wrote:
>> 
>> Hi, Phil,
>> 
>> We don't use Nginx but do SSL termination at a hardware load balancer,
>> with most of the work to support that setup done in the VCL, and something
>> similar could possibly apply to your scenario.
>> 
>> Our load balancer can use different backend ports depending on which
>> protocol the client requests; e.g., if the client connects to port 80 for
>> HTTP, then the load balancer proxies that to Varnish on port 80, while if
>> the client connects to 443 for HTTPS the load balancer proxies to Varnish
>> on port 8008. The choice of Varnish port numbers doesn't matter, just the
>> fact that Varnish listens on both ports and that the load balancer uses
>> one or the other based on the SSL status with the client (using the
>> command line option "-a :80,8008" in this case).
>> 
>> Then, in vcl_recv, we have the following to inform the backend when an SSL
>> request has arrived:
>> 
>> if ( std.port( server.ip ) == 8008 ) {
>>    set req.http.X-Forwarded-Proto = "https";
>> }
>> 
>> We also have the following in vcl_hash to cache HTTP and HTTPS requests
>> separately and avoid redirection loops:
>> 
>> if ( req.http.X-Forwarded-Proto ) {
>>    hash_data( req.http.X-Forwarded-Proto );
>> }
>> 
>> The backend then can look for that header and respond accordingly. For
>> example, in Apache we set the HTTPS environment variable to "on":
>> 
>> SetEnvIf X_FORWARDED_PROTO https HTTPS=on
>> 
>> I have no knowledge of Nginx, but if it can be configured to use different
>> backend ports then you should be able to use the above.
>> 
>> Best regards,
>> --
>> Carlos.

If you changed the data location for MySQL/MariaDB to a new disk and get an error like this on start or restart:

# systemctl start mariadb
Job for mariadb.service failed because the control process exited with error code.
See "systemctl status mariadb.service" and "journalctl -xe" for details.

Here’s the extended status:

# systemctl status mariadb
 mariadb.service - MariaDB 10.1 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2016-03-29 11:49:49 CDT; 3s ago
Process: 5233 ExecStopPost=/usr/libexec/mysql-wait-stop (code=exited, status=0/SUCCESS)
Process: 5219 ExecStart=/usr/libexec/mysqld --basedir=/usr $MYSQLD_OPTS $_WSREP_NEW_CLUSTER (code=exited, status=1/FAILURE)
Process: 5185 ExecStartPre=/usr/libexec/mysql-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Process: 5162 ExecStartPre=/usr/libexec/mysql-check-socket (code=exited, status=0/SUCCESS)
Main PID: 5219 (code=exited, status=1/FAILURE)
Status: "MariaDB server is down"

Mar 29 11:49:48 db1 systemd[1]: Starting MariaDB 10.1 database server...
Mar 29 11:49:49 db1 mysqld[5219]: 2016-03-29 11:49:49 140524370049152 [Note] /usr/libexec/mysqld (mysqld 10.1.12-MariaDB) starting as process 5219 ...
Mar 29 11:49:49 db1 mysqld[5219]: 2016-03-29 11:49:49 140524370049152 [Warning] Can't create test file /var/lib/mysql/db1.lower-test
Mar 29 11:49:49 db1 systemd[1]: mariadb.service: main process exited, code=exited, status=1/FAILURE
Mar 29 11:49:49 db1 systemd[1]: Failed to start MariaDB 10.1 database server.
Mar 29 11:49:49 db1 systemd[1]: Unit mariadb.service entered failed state.
Mar 29 11:49:49 db1 systemd[1]: mariadb.service failed

More than likely this has everything to do with SELinux. To check whether it’s enforcing:

# getenforce
Enforcing

To confirm this is the issue, temporarily put the mysqld_t domain into permissive mode:

root@db1 /etc/my.cnf.d# semanage permissive -a mysqld_t

Start MariaDB:

root@db1 /etc/my.cnf.d# systemctl start mariadb

Looks good, let’s check the status:

root@db1 /etc/my.cnf.d# systemctl -l status mariadb
 mariadb.service - MariaDB 10.1 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2016-03-29 11:52:07 CDT; 13s ago
Process: 5233 ExecStopPost=/usr/libexec/mysql-wait-stop (code=exited, status=0/SUCCESS)
Process: 5493 ExecStartPost=/usr/libexec/mysql-check-upgrade (code=exited, status=0/SUCCESS)
Process: 5430 ExecStartPre=/usr/libexec/mysql-prepare-db-dir %n (code=exited, status=0/SUCCESS)
Process: 5407 ExecStartPre=/usr/libexec/mysql-check-socket (code=exited, status=0/SUCCESS)
Main PID: 5464 (mysqld)
Status: "Taking your SQL requests now..."
CGroup: /system.slice/mariadb.service
└─5464 /usr/libexec/mysqld --basedir=/usr

Mar 29 11:52:06 db1 systemd[1]: Starting MariaDB 10.1 database server...
Mar 29 11:52:07 db1 mysqld[5464]: 2016-03-29 11:52:07 140638769318016 [Note] /usr/libexec/mysqld (mysqld 10.1.12-MariaDB) starting as process 5464 ...
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: The datadir located at /var/lib/mysql needs to be upgraded using 'mysql_upgrade' tool. This can be done using the following steps:
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: 1. Back-up your data before with 'mysql_upgrade'
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: 2. Start the database daemon using 'service mariadb start'
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: 3. Run 'mysql_upgrade' with a database user that has sufficient privileges
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: Read more about 'mysql_upgrade' usage at:
Mar 29 11:52:07 db1 mysql-check-upgrade[5493]: https://mariadb.com/kb/en/mariadb/documentation/sql-commands/table-commands/mysql_upgrade/
Mar 29 11:52:07 db1 systemd[1]: Started MariaDB 10.1 database server.

SIDE NOTE:  Looks like we need to run mysql_upgrade against the datadir on the newly mounted /var/lib/mysql now:

# mysql_upgrade -hlocalhost -uroot -pxxxxxxxxxxxxxx
Phase 1/6: Checking and upgrading mysql database
Processing databases
mysql
mysql.columns_priv OK
mysql.db OK
mysql.event OK
mysql.func OK
mysql.help_category OK
mysql.help_keyword OK
mysql.help_relation OK
mysql.help_topic OK
mysql.host OK
mysql.ndb_binlog_index OK
mysql.plugin OK
mysql.proc OK
mysql.procs_priv OK
mysql.proxies_priv OK
mysql.servers OK
mysql.tables_priv OK
mysql.time_zone OK
mysql.time_zone_leap_second OK
mysql.time_zone_name OK
mysql.time_zone_transition OK
mysql.time_zone_transition_type OK
mysql.user OK
Phase 2/6: Fixing views
Phase 3/6: Running 'mysql_fix_privilege_tables'
Phase 4/6: Fixing table and database names
Phase 5/6: Checking and upgrading tables
Processing databases
information_schema
performance_schema
Phase 6/6: Running 'FLUSH PRIVILEGES'
OK

Now, let’s permanently configure SELinux to allow MariaDB to write to the new disk.

Start by temporarily putting mysqld_t into permissive mode (should already be there):

# semanage permissive -a mysqld_t

Now we record what MariaDB is doing and create a policy that allows exactly that (and nothing else).

Rebuild the policy with dontaudit rules disabled (so every denial gets logged) and keep mysqld_t permissive:

# semodule -DB
# semanage permissive -a mysqld_t

Start MariaDB, then use the generated audit log to create a policy:

# grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
# semodule -i mariadb_local.pp
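
To confirm the module loaded:

# semodule -l | grep mariadb_local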

Let’s adjust the filesystem labeling so that SELinux knows this is MariaDB’s datadir. Check the current label, then set and apply the correct context:

# ls -ldaZ /datadir
drwxr-xr-x. root root unconfined_u:object_r:var_t:s0
# semanage fcontext -a -t mysqld_db_t "/var/lib/mysql(/.*)?"
# restorecon -Rv /var/lib/mysql
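
Re-check the label to make sure the relabel took:

# ls -ldaZ /var/lib/mysql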

You can check what’s in the policy here:

# semanage fcontext -l -C
SELinux fcontext                                   type               Context

/var/lib/mysql(/.*)?                               all files          system_u:object_r:mysqld_db_t:s0 

SELinux Local fcontext Equivalence 

/opt/remi/php70/root = /
/var/opt/remi/php70 = /var
/etc/opt/remi/php70 = /etc

Now, remove the permissive mode for mysqld_t and restore dontaudits:

# semodule -B
# semanage permissive -d mysqld_t
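
To confirm mysqld_t is no longer permissive, list the remaining permissive domains:

# semanage permissive -l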