Problem with Postfix email sending


#1

I am sending some bulk emails to all the users registered on the WordPress site. There are above 1000 users. Unfortunately, the email is being sent to only half of the people. It stops after sending some random number of emails. I am not sure why it is happening. I have checked with php timeout, WordPress error log, and mail.log file. but nothing that is related to the problem.

I am using WP Mail SMTP with Google OAuth 2.0. But I don`t think it is the issue with the OAuth.

Also, it is not the issue with the Gmail sending limits, as other emails are being triggered after this.

Here is my postfix configuration


alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
config_directory = /etc/postfix
inet_interfaces = all
inet_protocols = all
mailbox_size_limit = 0
mydestination = instance-1.c.abiding-triode-178713.internal, localhost.c.abiding-triode-178713.internal, localhost
myhostname = instance-1.c.abiding-triode-178713.internal
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
myorigin = /etc/mailname
readme_directory = no
recipient_delimiter = +
relayhost =
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
smtpd_tls_cert_file = /etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file = /etc/ssl/private/ssl-cert-snakeoil.key
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtpd_use_tls = yes

Following is the email log of that time: It was at around 4:00

Apr  3 03:01:20 instance-1 postfix/qmgr[1939]: 10BDD4067D: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 03:01:20 instance-1 postfix/qmgr[1939]: 0D0DE4060A: from=<>, size=2828, nrcpt=1 (queue active)
Apr  3 03:01:50 instance-1 postfix/smtp[21679]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 03:01:50 instance-1 postfix/smtp[21680]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 03:01:50 instance-1 postfix/smtp[21679]: 10BDD4067D: to=, relay=none, delay=210552, delays=210522/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 03:01:50 instance-1 postfix/smtp[21680]: 0D0DE4060A: to=, relay=none, delay=210582, delays=210552/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 03:06:20 instance-1 postfix/qmgr[1939]: 3D2F640620: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 03:06:50 instance-1 postfix/smtp[21690]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 03:06:50 instance-1 postfix/smtp[21690]: 3D2F640620: to=, relay=none, delay=210583, delays=210553/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:06:21 instance-1 postfix/qmgr[1939]: BBDFD40849: from=<>, size=2767, nrcpt=1 (queue active)
Apr  3 04:06:51 instance-1 postfix/smtp[22132]: connect to vanntage-in.mail.protection.outlook.com[23.103.140.170]:25: Connection timed out
Apr  3 04:07:21 instance-1 postfix/smtp[22132]: connect to vanntage-in.mail.protection.outlook.com[23.103.140.138]:25: Connection timed out
Apr  3 04:07:21 instance-1 postfix/smtp[22132]: BBDFD40849: to=, relay=none, delay=208482, delays=208421/0.01/60/0, dsn=4.4.1, status=deferred (connect to vanntage-in.mail.protection.outlook.com[23.103.140.138]:25: Connection timed out)
Apr  3 04:11:21 instance-1 postfix/qmgr[1939]: 10BDD4067D: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 04:11:21 instance-1 postfix/qmgr[1939]: 0D0DE4060A: from=<>, size=2828, nrcpt=1 (queue active)
Apr  3 04:11:51 instance-1 postfix/smtp[22200]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:11:51 instance-1 postfix/smtp[22201]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:11:51 instance-1 postfix/smtp[22200]: 10BDD4067D: to=, relay=none, delay=214752, delays=214722/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:11:51 instance-1 postfix/smtp[22201]: 0D0DE4060A: to=, relay=none, delay=214782, delays=214752/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:16:21 instance-1 postfix/qmgr[1939]: 3D2F640620: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 04:16:51 instance-1 postfix/smtp[22219]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:16:51 instance-1 postfix/smtp[22219]: 3D2F640620: to=, relay=none, delay=214783, delays=214753/0.01/30/0, dsn=4.4.1, status=deferred (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)

After Analysing I found that There were few emails in the postfix queue, I removed all of it and tested. After that my mail.log file is looking like

Apr  4 10:01:34 instance-1 postfix/postsuper[2460]: Deleted: 4 messages
Apr  4 10:01:56 instance-1 postfix/smtp[2404]: connect to vanntage-in.mail.protection.outlook.com[23.103.140.138]:25: Connection timed out
Apr  4 10:01:56 instance-1 postfix/smtp[2404]: BBDFD40849: to=, relay=none, delay=316157, delays=316097/0/60/0, dsn=4.4.1, status=deferred (connect to vann$

I amnot sure from where above two entries is coming from.

Following is the syslog output of that time: The domain I am using is enrollment.ethika.co.in


Apr  3 03:11:31 instance-1 mysqld: 2018-04-03  3:11:31 140331710982912 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEM
ENT. The statement is unsafe because it uses a LIMIT clause. This is unsafe because the set of rows included cannot be predicted. Statement: delete from wp_wfStatus where leve
l != 10 order by ctime asc limit 87
Apr  3 03:11:31 instance-1 mysqld: 2018-04-03  3:11:31 140331710982912 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEM
ENT. The statement is unsafe because it uses a LIMIT clause. This is unsafe because the set of rows included cannot be predicted. Statement: delete from wp_wfStatus where leve
l = 10 order by ctime asc limit 41
Apr  3 03:17:01 instance-1 CRON[21994]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Apr  3 03:39:01 instance-1 CRON[22036]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Apr  3 03:39:01 instance-1 CRON[22037]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessioncle
an /var/lib/php5 $(/usr/lib/php5/maxlifetime))
Apr  3 04:06:21 instance-1 postfix/qmgr[1939]: BBDFD40849: from=<>, size=2767, nrcpt=1 (queue active)
Apr  3 04:06:51 instance-1 postfix/smtp[22132]: connect to vanntage-in.mail.protection.outlook.com[23.103.140.170]:25: Connection timed out
Apr  3 04:07:21 instance-1 postfix/smtp[22132]: connect to vanntage-in.mail.protection.outlook.com[23.103.140.138]:25: Connection timed out
Apr  3 04:07:21 instance-1 postfix/smtp[22132]: BBDFD40849: to=, relay=none, delay=208482, delays=208421/0.01/60/0, dsn=4.4.1, status=deferred (connect to v
anntage-in.mail.protection.outlook.com[23.103.140.138]:25: Connection timed out)
Apr  3 04:09:01 instance-1 CRON[22140]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Apr  3 04:09:01 instance-1 CRON[22141]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessioncle
an /var/lib/php5 $(/usr/lib/php5/maxlifetime))
Apr  3 04:11:21 instance-1 postfix/qmgr[1939]: 10BDD4067D: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 04:11:21 instance-1 postfix/qmgr[1939]: 0D0DE4060A: from=<>, size=2828, nrcpt=1 (queue active)
Apr  3 04:11:51 instance-1 postfix/smtp[22200]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:11:51 instance-1 postfix/smtp[22201]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:11:51 instance-1 postfix/smtp[22200]: 10BDD4067D: to=, relay=none, delay=214752, delays=214722/0.01/30/0, dsn=4.4.1, status=deferr
ed (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:11:51 instance-1 postfix/smtp[22201]: 0D0DE4060A: to=, relay=none, delay=214782, delays=214752/0.01/30/0, dsn=4.4.1, status=deferr
ed (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:16:21 instance-1 postfix/qmgr[1939]: 3D2F640620: from=<>, size=2806, nrcpt=1 (queue active)
Apr  3 04:16:51 instance-1 postfix/smtp[22219]: connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out
Apr  3 04:16:51 instance-1 postfix/smtp[22219]: 3D2F640620: to=, relay=none, delay=214783, delays=214753/0.01/30/0, dsn=4.4.1, status=deferr
ed (connect to enrollment.ethika.co.in[35.202.0.207]:25: Connection timed out)
Apr  3 04:17:01 instance-1 CRON[22223]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Apr  3 04:39:01 instance-1 CRON[22267]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Apr  3 04:39:01 instance-1 CRON[22268]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime))
Apr  3 04:59:00 instance-1 mysqld: 2018-04-03  4:59:00 140331751320320 [Warning] Aborted connection 11883 to db: 'netcracker' user: 'root' host: 'localhost' (Got an error reading communication packets)
Apr  3 05:09:01 instance-1 CRON[22614]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Apr  3 05:09:01 instance-1 CRON[22615]: (root) CMD (  [ -x /usr/lib/php5/maxlifetime ] && [ -x /usr/lib/php5/sessionclean ] && [ -d /var/lib/php5 ] && /usr/lib/php5/sessionclean /var/lib/php5 $(/usr/lib/php5/maxlifetime))

I tested it again:

Following is the data for todays tests

Sent Count: 9761 To send: 351 Sent: 300 New Sent Count: 10061

Sent Count: 10068 To send: 51 Sent: 46 New Sent Count: 10114

Sent Count: 10114 To send: 5 Sent: 5 New Sent Count: 10119

All these actions are being performed by the cron job. And it is definitely not the issue of the timeout as I was performing some operation on user data and it was applied to all the users.

Please suggest something.


#2

I am experiencing a similar problem. Have you figured out?


#3

I figured the problem. In my case it was a AWS new account and this: https://aws.amazon.com/premiumsupport/knowledge-center/ec2-port-25-throttle/ After request to remove the limitations the email problem was gone.