On Wed, Oct 19, 2011 at 2:49 PM, Sam Boyer <act@samboyer.org> wrote:
> the concern with rackspace would be this kinda thing:
>
>
> https://www.eff.org/cases/indymedia-server-takedown
>
> On 10/19/11 11:41 AM, Chaz Cheadle wrote:
>> I can be available tomorrow in the city, not today though.
>> Why not stay with Rackspace? From reviewing the Panix website, I was not
>> bowled over with confidence in its reliability. I'm extremely happy with
>> Rackspace technical support. If we can afford to stay with them, I'd do
>> it. Depending on the traffic we hit, their transfer rates are pretty
>> competitive.
>>
>> On Wed, Oct 19, 2011 at 2:33 PM, Tom Gillis <thomaswgillis@gmail.com> wrote:
>>
>> *Short summary - site is on Rackspace Cloud, not Panix, for now.
>> *Let's try to get interested sysadmin folks in a room or skype tonight.
>>
>> Chaz / Kevin - what's your real-time availability in NYC today? We'd
>> like to have another work session - Dan R and Jake also have root on
>> the server and I'd like to get us all in a room or at least a skype to
>> coordinate (I won't be available onsite until tonight).
>>
>> BTW - one other thing that we need to come to a consensus on is our
>> long-term hosting provider. The ppl at the work session decided last
>> night (around 2 AM), after the Panix hosting server that I had
>> access on became unresponsive (looking into the cause of this) and
>> none of us had the account creds to restart the server or provision
>> another one, to move the site to a Rackspace Cloud account (cloned
>> from a legacy machine that I have set up as a LAMP box for freelance
>> projects). Not the best setup, but the best we could do at the last
>> minute - calling off the site launch (for the 2nd time in a week)
>> wasn't an option, since the staging site url was already starting to leak
>> to the general public and it just meant that the data / user migration
>> was going to get more complicated as time went by.
>>
>> ANYWAY - folks on the ground can sync up with Jake or Dan to get
>> access to the box, or we can do it over skype later (I'm trying to
>> stick to not sending credentials over the internet).
>>
>> There may be some duplication of effort here - getting a deployment
>> going that will get us thru the next few weeks, and then coming up
>> with a better long term solution.
>>
>>
>>
>> On Wed, Oct 19, 2011 at 1:13 PM, Kevin <king.feruke@gmail.com> wrote:
>> > lighttpd/nginx or any async server is exactly where I was going
>> > with fcgi... Having them use fam/gamin will help file stats if we
>> > find I/O being the problem
>> >
>> > Sorry for short mails but I'm mobile only for a while
>> >
>> > DON'T PANIC
>> >
>> > On Oct 19, 2011 12:05 PM, "Sam Boyer" <act@samboyer.org> wrote:
>> >>
>> >> i wish i could volunteer to hit this round the clock, but i just
>> can't
>> >> at the moment :( :(
>> >>
>> >> some thoughts on scaling this thing. first - we don't know squat until
>> >> we get into the box. second, once we do, installing some monitoring
>> >> tools - e.g. cacti - should be high priority; otherwise, we're just
>> >> gonna be flailing around in the dark. nagios is fine, but that'll get us
>> >> monitoring, not usage logs. alternatively/additionally we could look at
>> >> paying for monitoring from a service like new relic (which might be
>> >> better if only because it means less that we have to maintain ourselves,
>> >> at least at first). beyond that:
>> >>
>> >> - get xhprof onto a prod clone somewhere so we can actually look at
>> >> what's taking up the processing time. beyond low-hanging fruit, though,
>> >> that's gonna take some expertise to actually make a dent with.
>> >> - big duh, but we've got an opcode cache running...right? the site seems
>> >> too responsive right now for this NOT to be the case.
>> >> - getting mysql onto baremetal, or rackspace cloud (though that would
>> >> mean moving everything to rackspace, and i've already heard security
>> >> concerns about that), should probably be a priority. heavy db io through
>> >> a virt layer...meh.
>> >> - due to the 1s-minimum granularity issue the mysql slow query log is
>> >> almost a too-late-to-be-useful thing (unless the percona people FINALLY
>> >> got that patch in to add ms granularity in mainline...but i doubt it),
>> >> but we do need to run it, as it'll give us a hit list of queries for
>> >> optimization and/or caching.
>> >> - as kevin mentioned on the other thread, fcgi; and if we do that,
>> >> really, no reason not to switch to nginx. i don't know what our request
>> >> volume looks like so i don't know how much we'd be getting back there,
>> >> but really, there's no reason to be serving static assets with bloaty
>> >> apache workers.
>> >> - ordinarily, for a drupal site of this type, i'd advocate ESI. i have
>> >> no idea how well WP supports content chunking like that (and truth is,
>> >> good ESI strategies take a *while* to craft), but at the very least some
>> >> internal data caching could help with query volume (e.g., cache the
>> >> output of the query that generates the global activity feed for 30s or
>> >> so). again, though, i don't know how easy that is to layer in with WP,
>> >> and the more custom we get, the more difficult it's gonna be to
>> >> maintain.
>> >>
>> >> like i said, though, until we actually *know* where the problem(s) are,
>> >> we can't address them. also, somewhere in this thread i remember seeing
>> >> someone set up the expectation that we might just need ~500MB/proc.
>> >> dear god, i hope not. if that's the case, we could blow the entire war
>> >> chest that's been accumulated thus far for liberty plaza (~$230,000 i
>> >> read somewhere) and still only be able to support several hundred
>> >> concurrent users. that needs to be brought *down*.
>> >>
>> >> cheers
>> >> s
>> >>
>> >> On 10/19/11 8:47 AM, Kevin wrote:
>> >> > Agree nagios for the win, we should get logwatch going as well
>> >> >
>> >> > Rackspace cloud machines guarantee proc and allow bursting if
>> >> > available.
>> >> >
>> >> > Once we have more than one machine we need to think about config
>> >> > mgmt... i would suggest puppet. We could use blueprint to analyze the
>> >> > current machine and generate puppet files.
>> >> >
>> >> > https://github.com/devstructure/blueprint
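Whatever blueprint spits out, a hand-written starting manifest for the puppet route might look roughly like this; the class name and package names are hypothetical and assume a Debian/Ubuntu box:

```puppet
# hypothetical starting manifest -- class and package names are assumptions
class nycga::web {
  package { ['apache2', 'php5', 'php5-mysql', 'mysql-client']:
    ensure => installed,
  }

  service { 'apache2':
    ensure  => running,
    enable  => true,
    require => Package['apache2'],
  }
}
```

The win is that standing up a second (or fifth) web frontend becomes applying the manifest rather than hand-cloning a box.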
>> >> >
>> >> > DON'T PANIC
>> >> >
>> >> > On Oct 19, 2011 11:32 AM, "Chaz Cheadle" <ccheadle@gmail.com> wrote:
>> >> >
>> >> > I'd like to suggest zenoss/nagios for monitoring.
>> >> >     As for hardware configurations, I'd say we definitely should have
>> >> >     physical/dedicated DB servers with cloud webhosting. Unless we're
>> >> >     on Rackspace or Linode, it may be hard to ensure we'll get the
>> >> >     needed processor or I/O from a vps.
>> >> >     If we have one server now, we can start serving the whole thing
>> >> >     from there, then purchase cloud webservers to lighten the webload,
>> >> >     then add mysql replication in later if the DB reads start getting
>> >> >     high. Unless we're doing heavy editing on the site, one DB server
>> >> >     for now should handle all of the read requests.
>> >> >     With Zen/nagios we will be able to monitor the server and make
>> >> >     decisions on expansion. Let's figure out the resource issue we
>> >> >     have now with WP before jumping to cloud web hosts and MySql
>> >> >     replication.
>> >> >
>> >> > What is the current panix host package we're on?
>> >> >
>> >> > chaz
>> >> >
>> >> >     On Wed, Oct 19, 2011 at 11:14 AM, Todd Grayson <tgraysonco@gmail.com> wrote:
>> >> >
>> >> > Adding Chaz and Kevin,
>> >> >
>> >> >         guys, once consensus is reached with the dev leads, folks can
>> >> >         be ID'd and get started on the "how to move forward" as a
>> >> >         working team. Tom needs additional eyeballs and hands covering
>> >> >         production deploy as well as ongoing release engineering. The
>> >> >         subject is going to become a bigger deal as work continues.
>> >> >         Please review, and let's get a working plan together that
>> >> >         approaches the list in a way that makes resource on-boarding
>> >> >         clean and effective.
>> >> >
>> >> > Todd
>> >> >
>> >> > On 10/19/2011 8:06 AM, Todd Grayson wrote:
>> >> >> OK:
>> >> >>
>> >> >> As a conversation before going to the list, I'm reaching out
>> >> >> to you folks to establish consensus on what is going to happen
>> >> >> next. Please identify WHO should be included in this
>> >> >> conversation who is not currently a part of it. Once consensus
>> >> >> is in place we can go to the lists for specific volunteers. To
>> >> >> make this efficient and quick, the team in NYC should have on
>> >> >> hand the following items for the folks coming forward:
>> >> >>
>> >> >> * Development leads who are overseeing configuration for the
>> >> >>   current wordpress deploy and able to answer questions
>> >> >>   o available for q&a and facilitating access to repo's etc.
>> >> >>     when needed
>> >> >> * ID who the contacts are for the panix hosting services; a
>> >> >>   conference call with them to talk through what is being seen
>> >> >>   now and what we feel will be needed to reach capacity should
>> >> >>   be scheduled ASAP
>> >> >> * Is there any way to get current perf statistics from where
>> >> >>   the site is running now?
>> >> >>
>> >> >>
>> >> >> The call for specific volunteers will be based on the fact that we
>> >> >> need a team of folks to help out with systems and DB
>> >> >> administration tasks as well as performance tuning and
>> >> >> capacity planning. This will give the working technical team
>> >> >> depth and allow for a more continuous support model, as one
>> >> >> worker will only have limited hours in a day to contribute,
>> >> >> whereas a team model can support sustained activity over a
>> >> >> period of time.
>> >> >>
>> >> >> Here is what is needed from the current and previous volunteer
>> >> >> lists as well as contacts on the ground once they are
>> >> >> identified:
>> >> >>
>> >> >> * Technical Project Manager
>> >> >> * Linux systems administrators with web hosting backgrounds
>> >> >>   (and virtual hosting infrastructure)
>> >> >> * MySql DBA's supporting web hosted env's, wordpress
>> >> >>   environments
>> >> >>
>> >> >> Resources like this have already come forward to the IWG list;
>> >> >> we can start with these people. IMHO a target team of 6
>> >> >> people should be the goal (3 dba, 3 sysadmin).
>> >> >>
>> >> >> MySql
>> >> >>
>> >> >> http://groups.google.com/group/internet_working_group/browse_thread/thread/4bde061a2adacee6/ede85b8a1812c3cc?lnk=gst&q=MySQL#ede85b8a1812c3cc
>> >> >>
>> >> >> Linux Administration
>> >> >>
>> >> >> http://groups.google.com/group/internet_working_group/browse_thread/thread/694bde580564c681/f4afcbc78aa06aa3?lnk=gst&q=Linux+Administration#f4afcbc78aa06aa3
>> >> >>
>> >> >>
>> >> >> This call for volunteers will be the creation of a team that
>> >> >> will be dedicated to the infrastructure of the wordpress
>> >> >> sites, the DB infrastructure supporting them, and the apache /
>> >> >> php / wordpress install and configuration over your
>> >> >> dev/test/release environments moving forward. The folks
>> >> >> coming forward from online will need to be brought into the
>> >> >> planning and then included in communications as a team moving
>> >> >> forward.
>> >> >>
>> >> >> If you want to start the ball rolling on this let me know who
>> >> >> the contacts are from the "on the ground" requirements and we
>> >> >> can get going asap.
>> >> >>
>> >> >> IMHO the actual MySql DB's might have to be on physical
>> >> >> hardware if the IO we are seeing on the VM's shared backplane
>> >> >> is the bottleneck... or just reside in a MySQL DB farm. That
>> >> >> will have to be evaluated with iostat output once system access
>> >> >> is regained and the cause is isolated. It might simply be
>> >> >> memory related: disk IO pressure as paging/swap attempted to
>> >> >> scale for the demand of resources. If we know the right
>> >> >> process names, ssh pkill statements can be sent to try and
>> >> >> free up the system as well:
>> >> >>
>> >> >> ssh username@hostfqdn 'pkill httpd'
>> >> >>
>> >> >> Todd
>> >> >>
>> >> >> On 10/19/2011 6:33 AM, Tom Gillis wrote:
>> >> >>> And I feel like "scalable wordpress deployment" is a little
>> >> >>> bit of an oxymoron - but:
>> >> >>> good news - we have the nycga 2.0 site up, and the
>> >> >>> functionality is all working as expected.
>> >> >>>
>> >> >>> bad news - we needed to rush deployment so that working
>> >> >>> groups could start using new features, but wordpress is
>> >> >>> killing the cpu / memory on the server (a 16gb virtual box)
>> >> >>> and we know that a single-server hosting setup is not going
>> >> >>> to be viable.
>> >> >>>
>> >> >>> caching doesn't help us much since most of the content is
>> >> >>> dynamic and near-real-time - it's wordpress with buddypress
>> >> >>> on top so there's tons of forums and social-networky
>> >> >>> activity feeds.
>> >> >>>
>> >> >>> what we need:
>> >> >>> 1 - move mysql to its own server, and set up master / slave
>> >> >>> replication (2 virtual servers)
>> >> >>> 2 - set up a shared file hosting server for user-uploaded
>> >> >>> images - nfs mounts to a single box (1 box)
>> >> >>> 3 - set up load-balanced web frontends with sticky sessions
>> >> >>> (4 virtual boxes probably)
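Item 1 (master / slave replication) could start from my.cnf fragments like the following; the server-ids, log paths, and db name are placeholders:

```ini
# hypothetical master box
[mysqld]
server-id    = 1
log_bin      = /var/log/mysql/mysql-bin.log
binlog_do_db = nycga

# hypothetical slave box
[mysqld]
server-id = 2
relay_log = /var/log/mysql/mysql-relay-bin.log
read_only = 1
```

After seeding the slave with a dump of the master, a CHANGE MASTER TO ... / START SLAVE on the slave wires it up; reads can then be pointed at the slave while all writes stay on the master.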
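Item 2 (shared uploads over nfs) comes down to roughly two config lines; the hostname, subnet, and paths below are placeholders:

```
# /etc/exports on the file box (export the uploads dir to the web subnet)
/srv/uploads  10.0.0.0/24(rw,sync,no_subtree_check)

# /etc/fstab line on each web frontend (mount it where wordpress expects uploads)
files.internal:/srv/uploads  /var/www/nycga/wp-content/uploads  nfs  defaults,noatime  0  0
```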
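Item 3 (sticky sessions) could be handled with cookie insertion at the balancer; a haproxy sketch, with placeholder IPs for the web frontends:

```
frontend www
    bind *:80
    default_backend wp_pool

backend wp_pool
    balance roundrobin
    # pin each browser to one backend via an inserted cookie
    cookie SRV insert indirect nocache
    server web1 10.0.0.11:80 cookie w1 check
    server web2 10.0.0.12:80 cookie w2 check
```

Cookie stickiness keeps logged-in buddypress sessions on one box without needing shared session storage on day one.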
>> >> >>>
>> >> >>> I'm hoping to find a few people who will volunteer to work
>> >> >>> with the internet group, either in nyc or remotely, over the
>> >> >>> next 72 hrs to make a push to get this infrastructure in
>> >> >>> place. in parallel, we'll be making code optimizations to the
>> >> >>> site (lots of low-hanging fruit here, like minifying js and
>> >> >>> css). i'm hoping to find somebody who can set up one aspect
>> >> >>> of the infrastructure - I'll hook them up with a cloned
>> >> >>> version of the production server, which they can modify to
>> >> >>> fulfill one of these other roles, and then we can deploy that
>> >> >>> back into the main infrastructure. I'm probably going to be
>> >> >>> asleep until around 1pm nyc time, but I'm hoping to have some
>> >> >>> volunteers by the time I come back. And right now we really
>> >> >>> need people who can free up most of their time for the rest
>> >> >>> of the week on this (we're literally working around the clock
>> >> >>> in nyc so you'll have people to coordinate with no matter
>> >> >>> where / when you're available).
>> >> >>>
>> >> >>> Any takers?
>> >> >>