dtla92562 2019-05-07 09:56

A page request for 10 pictures and some text freezes nginx and php-fpm and disconnects other services?

Is it possible that a request for a page, where the server or PHP might have an issue, freezes the system and even disconnects unrelated SSH sessions?

I am running a simple webpage (10 pictures and some text) in a dockerized environment with a separate reverse proxy, web server, and database (nginx, php-fpm, and PostgreSQL).

The whole system was up without a restart for a year or so, without problems. For about a month now I have had a new issue with page/system freezes. When I visit my webpage it locks up from time to time (sometimes opening 1 instance is enough, other times I need to open up to 20) and needs about 30 seconds to start reacting again. The strange thing is that if I am connected to the server over SSH in parallel, it sometimes (not always) also disconnects my terminal. That is why I believed it has something to do with the system (but I can't find anything there, so I am trying a different perspective here).

Server (only remote access available): Debian GNU/Linux 9.4 (stretch), kernel 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux, 68 GB RAM, 8 cores, 2x 4 TB HDDs and a 1 TB SSD, 1 Gbit uplink

I have monitoring installed, and there does not seem to be any high load on I/O, network, CPU, or anything else during the lock-up (I am not monitoring PHP stats, though). I also have the same setup running on a local test server (different hardware and kernel 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux), and that server has no freezing issues, which is again an argument against the issue being in the dockerized environment or my page code.

What I have done so far on the hardware side:

  • 1.) SMART diagnostics: no obvious issues. (The backup disk, which is not the one the servers are stored on, has for some time shown: 191 G-Sense_Error_Rate 0x0032 001 001 000, but the provider ran a separate test some time ago and said that the disk has no issue and that G-Sense_Error_Rate has little informational value anyhow.)
  • 2.) atop (htop and iotop are live-only, and since SSH disconnects I can't watch them while the problem occurs) at a 1 s interval for 300 samples (thus 5 minutes), during which I was able to produce multiple freezes, but there were no obvious load issues (granted, this is the first time I am looking at these things, but there was also none of the high-load line coloring that atop applies automatically)
  • 3.) I also have a dockerized monitoring stack running (the freeze occurs both with it running and with it disabled, so it should not be the cause either), where I can view the containers separately, and they do not show anything alarming either
  • 4.) restarted the whole server - issue continues
  • 5.) ran memtester on 55 of 65 GB of RAM without issues
  • 6.) no problems in syslog
  • 7.) pinged the server while producing the error: the ping is quick at 27 ms, but while the server hangs I lose about 1 ping in 10 (during those 30-40 s; afterwards the ping is perfect again). I cannot figure out why that is
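
The intermittent loss in item 7 could be captured with timestamps, so that it can be lined up against the atop samples from item 2. What follows is only a minimal sketch, run from a second machine, not a definitive diagnostic. The host name "example.org" is a placeholder, ports 22 (SSH) and 80 (HTTP) are assumed, and the function names and the 0.5 s "slow" threshold are chosen for illustration. It measures TCP connect time rather than ICMP ping, so it also shows whether the SSH and web ports stall at the same moment.

```python
import socket
import time
from datetime import datetime


def probe_tcp(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections, timeouts and DNS errors
        return None


def summarize(samples, slow_threshold=0.5):
    """Count failed and slow probes in a list of connect times (None = failed)."""
    failed = sum(1 for s in samples if s is None)
    slow = sum(1 for s in samples if s is not None and s > slow_threshold)
    return {"total": len(samples), "failed": failed, "slow": slow}


def run(host, ports, count=300, interval=1.0):
    """Probe every port once per round and print one timestamped line per probe."""
    results = {port: [] for port in ports}
    for _ in range(count):
        for port in ports:
            elapsed = probe_tcp(host, port)
            results[port].append(elapsed)
            status = "FAIL" if elapsed is None else f"{elapsed * 1000:.1f} ms"
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {status}")
        time.sleep(interval)
    return {port: summarize(samples) for port, samples in results.items()}


if __name__ == "__main__":
    # Placeholder host and assumed ports; replace with the real server.
    print(run("example.org", [22, 80]))
```

The defaults of 300 rounds at a 1 s pause mirror the atop run above, although failed probes lengthen a round by up to the 2 s timeout. If both ports fail in the same seconds while CPU and I/O stay idle, that would point more towards the network path or the host's network stack than towards nginx or php-fpm.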

Where else could I look?

Any suggestions are highly appreciated! Thanks!


1 answer

  • duanbicheng3345 2019-05-07 10:06
    Strange that this has only started to happen within the last few months and was fine previously.

    Are you pulling down the latest images for nginx, postgres, etc.? Maybe it's a problem with the version of the images, and you could try pinning a specific release.
