> It had problems with signatures when messages contained a certain field, so some messages appeared and others were missing.

Did you file a bug on this? We're not aware of any signature problems whatsoever in Synapse (or Matrix).

> Why do messages have so many fields?

So looking at a message like this:

  {
    "origin_server_ts": 1517313544657,
    "sender": "@kitsune:matrix.org",
    "event_id": "$15173135441073284GrtsX:matrix.org",
    "unsigned": {
      "age": 277
    },
    "content": {
      "body": "Yep, almost a year ago.",
      "msgtype": "m.text"
    },
    "type": "m.room.message",
    "room_id": "!eUGMvloIjhBoAwlyRh:matrix.org"
  }
You get:

* `origin_server_ts`: the absolute timestamp the sending server claims

* `sender`: the Matrix ID of the sender

* `event_id`: the UID of the event

* `unsigned.age`: a relative timestamp for receipt on your local server (added by your local server, hence being unsigned)

* `type`: the namespaced type of the event. In this instance, `m.room.message` is the general purpose instant message event type.

* `content`: arbitrary JSON describing the contents of this type of event. In this instance, it's a plain text message with a given body.

* `room_id`: the room the event was sent in.

This doesn't seem unreasonable? (but I may be biased thanks to working on it).
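(For illustration only, a rough Python sketch of pulling those fields out once the event has arrived as JSON; the helper name is made up and this isn't Synapse or SDK code:)

  import json

  def summarise_event(raw_json):
      # Hypothetical helper: pick out the fields described above
      # from a decoded m.room.message event.
      event = json.loads(raw_json)
      return {
          "sender": event["sender"],            # Matrix ID of the sender
          "sent_at_ms": event["origin_server_ts"],
          "event_id": event["event_id"],
          "room": event["room_id"],
          "type": event["type"],                # e.g. "m.room.message"
          "body": event.get("content", {}).get("body"),
      }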

> To top it off, the server was making me hit my ISP's (pretty bad ISP) connections-per-second limit.

Oops. This might have been a while ago, before we improved the connection pooling for federation traffic? What's the limit?

> Apparently, development focus isn't on making the basic protocol perfect, but on UX nonsense and adding features

We're trying to do both. A perfect protocol is useless if it doesn't have flagship apps which make it usable by normal people. Right now the protocol is relatively good; the Python/Twisted impl is very heavyweight (but getting better); Riot (as a flagship app) also has perf problems but is hopefully headed in the right direction.

Dendrite should be an enormous improvement on the server side when it gets there, however!




> issue

To be clear, it got solved, and it happened several months ago.

> the Python/Twisted impl is very heavyweight (but getting better)

It was constantly near-saturating the rpi2's CPU (70%+ CPU time) while doing effectively nothing, last time I turned it on. This is why I'm thinking I'll wait for the Go implementation before I try again.

> Right now the protocol is relatively good

A couple of questions:

- If I delete my server database, start anew, and join a room I used to be in, will I end up in a bad state, or are things more robust these days? I remember reading about it somewhere.

- Does the server _still_ resolve a bunch of names and open hundreds of connections per server event interval, or whatever it was called?


> If I delete my server database, start anew, and join a room I used to be in, will I end up in a bad state, or are things more robust these days? I remember reading about it somewhere.

It should end up (eventually) in a consistent state, but it can take a while to sync up again.

> Does the server _still_ resolve a bunch of names and open hundreds of connections per server event interval, or whatever it was called?

It's still full mesh, so whenever you send a message in a room, your server has to make an HTTPS hit to every other server which is participating in that room. In a massive room like Matrix HQ, this could mean 800 hits or so. It shouldn't do DNS every time, and it shouldn't open a new connection every time, but I haven't checked the connection pooling recently; hopefully it hasn't regressed.
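(Schematically, the fan-out looks something like this; a simplified Python sketch of one HTTPS hit per remote server, not Synapse's actual transaction-queue code:)

  import requests

  def fan_out(event, remote_servers, txn_id):
      # Illustrative only: one HTTPS PUT per remote server in the room.
      # Real federation requests are also signed and batched into
      # transactions; that's omitted here for brevity.
      for server in remote_servers:  # ~800 destinations for Matrix HQ
          url = "https://%s/_matrix/federation/v1/send/%s" % (server, txn_id)
          requests.put(url, json={"pdus": [event]}, timeout=10)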


> whenever you send a message in a room

Just to be clear, I saw this behavior without sending anything to any room. Just by being in the room.

If the server interval setting was 5 seconds, it'd literally do hundreds of connections every 5 seconds.


There is no such server interval setting, and there never has been? I can only assume that this was the retry schedule doing exponential backoff, trying to contact servers that are down. Currently we don't have the concept of shared retry hints (as deciding whose hints to trust would be hard), so every server has to work out for itself which servers in the mesh are available. After about 10 minutes it calms down.
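(Roughly the shape of that per-destination backoff, as an illustrative sketch only; the numbers and names are made up, not Synapse's actual retry schedule:)

  def next_retry_interval_ms(previous_ms, multiplier=2, cap_ms=10 * 60 * 1000):
      # Illustrative exponential backoff: each failed attempt to reach a
      # dead server pushes the next attempt further out, capped at ~10 min,
      # which is roughly why the burst of connections calms down over time.
      if previous_ms == 0:
          return 1000  # first failure: retry after about a second
      return min(previous_ms * multiplier, cap_ms)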


homeserver.yaml:

  # The federation window size in milliseconds
  #federation_rc_window_size: 1000
  federation_rc_window_size: 60000

That's how high it needs to be set, apparently.


That doesn't control how aggressively the server connects out to other servers though - it limits how rapidly the server processes inbound requests; upping the window from 1s to 60s means that it will only process 10 requests from a given server in a 60s window (rather than a 1s window) before deliberately falling behind. It's interesting that changing it helped your problem; not sure how to interpret that.
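(Roughly how that kind of windowed inbound rate limiting works; an illustrative Python sketch with made-up names, apart from the 10-requests-per-60s-window figure mentioned above:)

  import time
  from collections import defaultdict, deque

  WINDOW_MS = 60000     # federation_rc_window_size from the config above
  MAX_REQUESTS = 10     # requests processed per origin server per window

  _recent = defaultdict(deque)

  def should_delay(origin_server):
      # Illustrative only: count requests from each origin in a sliding
      # window; once the limit is hit, further requests are deliberately
      # delayed ("falling behind") rather than rejected outright.
      now_ms = time.time() * 1000
      q = _recent[origin_server]
      while q and now_ms - q[0] > WINDOW_MS:
          q.popleft()
      q.append(now_ms)
      return len(q) > MAX_REQUESTS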



