r3 - 21 Jan 2004 - 11:53:53 - MimiYin

Data sharing in Canoga

  1. Introduction
  2. Base requirements
  3. Base technical discussion
  4. Base design
  5. Groups
  6. Groups technical discussion
  7. Groups design
  8. Delegation
  9. Technical notes

Introduction

Data sharing is one of Chandler's 'killer' features. Users will be able to share their data with other users worldwide.

While simple to describe, data sharing is complex to implement. Much of the complexity comes from the security requirements. Users want to be sure that their private data remains invisible to the world. At the same time, they do not want to become sysadmins who set up accounts and passwords and worry about security. Power users would like features such as delegation and groups, which do require some sysadmin-like setup. Sysadmins, in turn, worry about the implications of their users giving the world read/write access to their repositories, and need to be assured that Chandler will not become the weak link in their security infrastructure.

Before going into details, let's define some terms:
Sharer
The repository owner. The sharer grants grantees permission to access the repository.
Grantee
A user who has been granted permission by the sharer to access the sharer's repository.

Base requirements

  • Sharing of data between two users should be simple to set up. The sharer need not know anything more than the grantee's email address.
  • Sensitive data should always be encrypted when on the wire.

    Base sharing technical discussion

    The 4 main problems in base sharing architecture are:

    1. over-the-wire encryption
    2. permissions
    3. authentication
    4. safety

    There are other, minor problems that we need to be aware of but that are not covered in depth here, for brevity. They are mostly common-sense policies:

    1. Data that a client does not have the right to access should never be visible to it. Server queries must filter prohibited data out of the query results. This matters for database design, particularly full-text search. The client should be assumed to be potentially malicious.
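
As a rough illustration of server-side filtering, the sketch below runs a toy full-text match and then drops every item the requesting user may not read before anything is returned. The `Item` class, its `readers` field, and the function names are assumptions made for this example, not part of the Chandler repository design.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A repository item; `readers` is a hypothetical per-item access list."""
    title: str
    readers: set = field(default_factory=set)


def visible_results(results, user):
    """Drop every item the user may not read before anything leaves the server."""
    return [item for item in results if user in item.readers]


def search(items, text, user):
    """Toy full-text search: match first, then filter by permission."""
    matches = [item for item in items if text.lower() in item.title.lower()]
    return visible_results(matches, user)


items = [
    Item("Budget meeting", readers={"alice@example.com"}),
    Item("Team meeting", readers={"alice@example.com", "bob@example.com"}),
]
bob_titles = [i.title for i in search(items, "meeting", "bob@example.com")]
```

Because the filter runs on the server, a malicious client never receives the title of the budget meeting, not even as a hit count.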

    First problem: Over-the-wire encryption

    My favorite problem, because it is solved by one word: TLS (aka SSL). TLS will be used to encrypt all data going over the wire: usernames, passwords, and repository data. Our repository access protocol, BEEP, already supports TLS.

    Second problem: Permissions

    Permissions are stored inside the repository. Each authenticated user has a list of capabilities associated with his account. Users have to be aware of what permissions they are granting and to whom. This is both a user-interface problem and a database architecture problem. The security group's job will be to mediate the solution. The current proposal is for the user to share views of the data.

    Third problem: Authentication

    When a grantee logs into a sharer's repository, how does he prove his identity? We do not assume any pre-existing authentication infrastructure for Canoga. We would also like to make our design extensible, so that existing authentication infrastructure can be used if it exists.

    Our authentication requirements are:

    • ability to grant access to someone before establishing 2-way electronic contact (you only know their email address or AIM ID)
    • ability to log into a shared Chandler server when you do not have access to your own repository
    • no single point of catastrophic failure (such as losing your public key)
    • support for bulletproof security (public key based)
    • delegation
    • groups

    No single solution supports all these requirements. A blend of technologies might. Here are the technologies we will use:

    Obfuscated URLs

    Obfuscated URLs are a solution to a single problem:

    • how to grant access to someone when you only know their email address?

    In Chandler, we'll use obfuscated URLs. When the grantee is notified that he has been given permission, he'll get a URL to use for his first access. This URL is a random string that is impossible for an intruder to guess. The first time he accesses the shared repository, the URL will be discarded, and from then on other, more sophisticated mechanisms will be used.

    Obfuscated URLs are widely used on the web for subscription confirmation. A web site will often send you a confirmation email with an obfuscated URL that you need to click on before your account is activated.
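
A minimal sketch of issuing and redeeming such a URL follows. The `GrantStore` class, the `/grant/` path, and the 32-byte token length are assumptions for illustration; only the behavior (unguessable random string, discarded after first use) comes from the text above.

```python
import secrets


class GrantStore:
    """In-memory stand-in for the sharer's record of outstanding grants."""

    def __init__(self, base_url):
        self.base_url = base_url
        self._pending = {}  # token -> grantee email

    def issue(self, grantee_email):
        """Create a one-time URL to email to the grantee."""
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._pending[token] = grantee_email
        return f"{self.base_url}/grant/{token}"

    def redeem(self, url):
        """Return the grantee for a valid URL and discard it; None otherwise."""
        token = url.rsplit("/", 1)[-1]
        return self._pending.pop(token, None)


store = GrantStore("https://sharer.example.org")
url = store.issue("bob@example.com")
first = store.redeem(url)   # the grantee's first access succeeds
second = store.redeem(url)  # the URL has been discarded, so this fails
```

Removing the token on first use is what limits the interception window to the period between sending the email and the grantee's first login.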

    Obfuscated URL vulnerabilities are:

    • Interception. If someone reads the grantee's grant email before the first access, they will be able to access the repository.

    Username/password

    Username/passwords can solve three problems:

    • Access to a shared repository when you do not have access to your own
    • Fallback mechanism in case your public key is lost/corrupted
    • Usability: users know how to use username/password mechanism

    The weaknesses of username/password are well known:

    • sharer's data is only as secure as the passwords grantees pick, and grantees are notorious for picking passwords that are easy to guess and for reusing them across accounts
    • grantee's usability: keeping track of passwords is a chore
    • username/passwords cannot be used for sharing/delegation. Sharing and delegation with username/password means giving your password to someone.

    Sharing passwords between accounts is unacceptable in Chandler: if a grantee used the same password everywhere, every sharer whose repository he can access would know his password for all the others. This problem can be avoided with crypto. Before sending his password to a sharer, the grantee will encrypt it with a unique ID belonging to that sharer.
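
One way to get this effect is sketched below. It swaps in a one-way key derivation (PBKDF2 with the sharer's ID as salt) for the "encryption" described above, which is an assumption of this example rather than the agreed design. The sharer ID is treated as an opaque string, and the function name and iteration count are likewise illustrative.

```python
import hashlib


def password_for_sharer(password, sharer_id, iterations=100_000):
    """Derive the value actually sent to one sharer.

    `sharer_id` stands for whatever unique ID the design settles on; here it
    is simply used as the PBKDF2 salt, so each sharer sees a different value.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        sharer_id.encode("utf-8"),
        iterations,
    )
    return derived.hex()


for_alice = password_for_sharer("correct horse", "alice-repository")
for_carol = password_for_sharer("correct horse", "carol-repository")
```

The grantee remembers one password, but the two sharers receive unrelated values, so neither can replay what it received against the other.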

    What the password encryption ID will be is still a subject of discussion. A sharer should not be able to spoof another sharer's ID. Some proposals are:

    • sharer's TLS key: the key cannot be spoofed, because the sharer has to prove that he has the corresponding private key. But this would mean that multiple repositories would have to share their SSL key, which is not feasible.
    • sharer's address: the problem with this scheme is that if the sharer's address changes, the passwords become invalid.

    Public keys (PK)

    Once deployed, public keys solve most authentication problems: they are very secure, and they can be used for sharing/delegation. Deployment, however, is their weakness:

    • public key distribution: how does the sharer obtain the grantee's public key? The usual answer to this question is either from existing infrastructure (an LDAP server) or by email. We cannot assume that either is available at the moment the sharer wants to share the data.
    • trust: how can the sharer be certain that a given certificate belongs to a particular user? In Chandler this will be the sharer's responsibility. Chandler will do its best to help the user make the right decision. The connection between a certificate and an identity in the address book will be made when the certificate is imported into the repository. A history of how the certificate was obtained will be kept with the certificate. The main ways of disseminating certificates will be:
      • vCards, through email or a web site. In this scheme, the sharer is vulnerable to spoofing attacks.
      • upon first login with obfuscated URLs. In this scheme, the sharer is vulnerable to email interception.
      • existing PKI infrastructure (LDAP directories)
      Users very concerned about their security will want to verify certificates out-of-band by comparing certificate hashes. Here, some fun ideas are representing hashes as snowflakes or color palettes.
    • key renewal: users' keys expire, or get lost or stolen. How do others learn of the key change?
    • portability: users can't memorize their keys (except maybe the late von Neumann). If they do not have access to their own repository, they can't access any other repositories they have been granted access to either.
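
The out-of-band hash comparison mentioned under "trust" could look roughly like this. The choice of SHA-256 and the colon-separated display format are assumptions; the text does not fix a hash function or a presentation (snowflakes and color palettes would be rendered from the same digest).

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 of the DER-encoded certificate as colon-separated hex pairs."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches(cert_der: bytes, fingerprint_from_owner: str) -> bool:
    """Compare against a fingerprint obtained out of band (phone, in person)."""
    return fingerprint(cert_der) == fingerprint_from_owner.strip().upper()


# Placeholder bytes stand in for a real DER-encoded certificate.
received_cert = b"certificate bytes received by email"
read_over_phone = fingerprint(received_cert).lower()
trusted = matches(received_cert, read_over_phone)
```

A certificate that was swapped in transit produces a different fingerprint, so the comparison fails even though the attacker controlled the email channel.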

    To work around these weaknesses, Chandler will perform as much key management as possible automatically, without the user's involvement. By default, keys will be exchanged automatically upon connecting to the repository for the first time. If a key is lost or corrupted, the permissions will be reissued.

    Fourth problem: general safety

    Data sharing is an inherently risky activity. A sharer or grantee could turn malicious, and hackers can try to break passwords. We should take reasonable precautions to defend against these attacks:

    • sharer's data is not what grantee expects: the danger is that, in the current architecture, we create Python objects as a result of database queries. If the grantee does no type checking, it might be possible for the sharer to pollute the grantee's repository. An example of such an attack would be to schedule a meeting with the grantee automatically when he looks at the sharer's free/busy time.
    • grantee abusing its write privileges: if a grantee abuses its write privileges by filling up our database with garbage entries, it would be nice to have the ability to roll back the changes.
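
For the first attack, the grantee's side could validate whatever comes back from a query against what it asked for before building any objects. The sketch below does this for a free/busy record; the field names and the allowed status values are a hypothetical schema invented for the example.

```python
# Hypothetical schema for a free/busy record: field name -> required type.
ALLOWED_FIELDS = {"start": str, "end": str, "status": str}
ALLOWED_STATUS = {"free", "busy"}


def validate_free_busy(record):
    """Return a clean copy of a free/busy record or raise ValueError."""
    if not isinstance(record, dict):
        raise ValueError("record must be a mapping")
    if set(record) != set(ALLOWED_FIELDS):
        extra = sorted(set(record) ^ set(ALLOWED_FIELDS))
        raise ValueError(f"unexpected or missing fields: {extra}")
    for name, expected in ALLOWED_FIELDS.items():
        if type(record[name]) is not expected:
            raise ValueError(f"{name} must be {expected.__name__}")
    if record["status"] not in ALLOWED_STATUS:
        raise ValueError("unknown status")
    return dict(record)


good = {"start": "2004-01-21T09:00", "end": "2004-01-21T10:00", "status": "busy"}
clean = validate_free_busy(good)
```

A reply that smuggles in anything else, such as a meeting invitation, is rejected as a whole instead of being stored in the grantee's repository.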

    Other common-sense security measures should be implemented:

    • data sharing activity should be logged. A logging agent could present a summary to the user, flagging suspicious activity (an account accessed from hundreds of IP addresses might indicate a leaked password or key)
    • reasonable precautions should be taken against common attacks: for example, 10 wrong passwords should lock out an account.
    • system administrators should be able to evaluate the safety of risky Chandler features, and turn them off if desired. This way, each installation can pick its own safety vs. convenience threshold.
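
The first two measures can be sketched together as a small access log. The lockout threshold of 10 comes from the text; the cutoff of 100 distinct addresses, the class name, and the choice to count consecutive failures are assumptions.

```python
from collections import defaultdict

MAX_FAILURES = 10          # threshold from the text
SUSPICIOUS_IP_COUNT = 100  # assumed cutoff for "hundreds of IP addresses"


class AccessLog:
    def __init__(self):
        self.failures = defaultdict(int)   # account -> consecutive failed logins
        self.addresses = defaultdict(set)  # account -> distinct source IPs

    def is_locked(self, account):
        return self.failures[account] >= MAX_FAILURES

    def record(self, account, ip, success):
        """Log one login attempt; return False if it was refused by a lockout."""
        if self.is_locked(account):
            return False
        self.addresses[account].add(ip)
        if success:
            self.failures[account] = 0
        else:
            self.failures[account] += 1
        return True

    def suspicious_accounts(self):
        """Accounts a logging agent might flag in its summary for the user."""
        return sorted(account for account, ips in self.addresses.items()
                      if len(ips) >= SUSPICIOUS_IP_COUNT)
```

Thresholds like these are natural candidates for the per-installation settings that system administrators tune.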

    Base sharing design: how it all comes together

    Every user has a public/private key pair. The key is generated automatically when installing Chandler and stored in the repository. If the key is lost/corrupted, a new key is generated automatically.

    For first-time access: the sharer emails the grantee an obfuscated URL. The grantee uses the URL to log into the sharer's repository. Upon logging in, the grantee gives the sharer its public key and picks a password (the username is always the email address).

    For continuous access: the grantee now has 2 ways to authenticate to the sharer's repository: the username/password combination and the public key. By default, the public key is used. If the key is lost or unavailable, the grantee can still use username/password.

    Sharer has the option to disallow username/passwords for security reasons. By doing this, he reduces his risks, but possibly inconveniences the grantee.
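
The selection between the two mechanisms might be sketched as follows. Proof of possession of the private key is reduced to a boolean `key_proof_ok` (in practice it would come from something like TLS client authentication), and the stored password value is compared as an opaque string; both simplifications, and all names, are assumptions of this example.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass
class GranteeAccount:
    email: str                           # the username is always the email
    public_key: Optional[str] = None     # registered on first access
    password_hash: Optional[str] = None  # stored form of the chosen password


def authenticate(account, key_proof_ok=False, password_hash=None,
                 passwords_allowed=True):
    """Return the method that succeeded: "public-key", "password", or None."""
    if account.public_key is not None and key_proof_ok:
        return "public-key"  # the default path
    if (passwords_allowed
            and account.password_hash is not None
            and password_hash is not None
            and hmac.compare_digest(account.password_hash, password_hash)):
        return "password"    # fallback when the key is lost or unavailable
    return None


bob = GranteeAccount("bob@example.com", public_key="bob-key", password_hash="ab12")
```

Passing `passwords_allowed=False` models the sharer's option to disallow passwords: a grantee without his key is then refused even with the right password.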

    Groups

    Groups is a catch-all name for a set of features that rely on groups. What they have in common is that their foundation is the ability to create groups and assign people membership in them. Some useful group features are:

    • access control based on group membership: allows one to give privileges to coworkers, family
    • public groups: groups can be shared among users.

    Groups technical discussion

    Group is defined by a list of users and groups that belong to it (group members).

    Groups will be used for authorization by a sharer. Sharer can give permissions to a group just like he does for users.

    For a grantee to be granted access as a member of a group, the sharer has to verify that the grantee belongs to that group. Establishing the grantee's group membership is where things get interesting. Let's start with a walkthrough of the simple case, where all information resides in the sharer's repository: [insert chart here?]

    • sharer's repository contains a list of sharer's contacts. Each contact record contains a name and credentials (key, username/pw).
    • sharer's repository contains group information. Group is defined as a list of contacts.
    • when grantee logs in, he authenticates using his credentials.
    • credentials are associated with an address book record.
    • the address book record is used to look up the grantee's group membership.
    • resource access permissions are queried with the grantee's address book record and group membership to grant or deny access to a resource.
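
The steps above can be sketched with plain dictionaries. The data layout is an assumption: `credentials` maps a credential to a contact ID, `groups` maps a group name to a set of members (contact IDs or other group names, so groups can nest), and `permissions` maps a resource to the contacts and groups allowed to use it. Contact IDs and group names are assumed not to collide.

```python
def groups_of(contact, groups):
    """All groups a contact belongs to, following nested groups."""
    found = set()
    changed = True
    while changed:  # repeat until no new enclosing group turns up
        changed = False
        for name, members in groups.items():
            if name not in found and (contact in members or found & members):
                found.add(name)
                changed = True
    return found


def may_access(credential, resource, credentials, groups, permissions):
    """Walk the lookup chain: credential -> contact -> groups -> permission."""
    contact = credentials.get(credential)
    if contact is None:
        return False  # authentication failed
    principals = {contact} | groups_of(contact, groups)
    return bool(principals & permissions.get(resource, set()))


credentials = {"key:bob": "bob", "key:carol": "carol"}
groups = {"team": {"bob"}, "coworkers": {"team", "dave"}, "family": {"carol"}}
permissions = {"work-calendar": {"coworkers"}, "photos": {"family"}}
```

Here bob reaches the work calendar through the nested chain team, then coworkers, while an unknown credential is refused before any group lookup happens.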

    This simple scenario requires that groups and credentials be stored inside the same repository. This works for simple sharing needs, where sharer both defines the groups, and the permissions.

    Sometimes it would be nice to use a group defined by someone else. This feature will be called "public groups". The simple scenario described above will be called "private groups". For example, in a workgroup scenario, we want maintenance of the list of people belonging to the workgroup to be centralized. Otherwise, all the members of the workgroup have to maintain their own lists of members, which is error-prone and tedious: just the kind of thing that computers are good at.

    Implementing public groups is complicated. The complication stems from two requirements:

    • sharers should always be able to verify a grantee's group membership
    • the authoritative list of group members is maintained on a different repository from the sharer's

    How does a sharer verify grantee's group membership?

    a) The sharer can maintain a local copy of the group membership from the authoritative server. This requires solving the mapping of group records to the sharer's local records.

    b) Have the grantee present credentials that authenticate him as a member of the group. The grantee obtains these credentials through some crypto magic. There are 2 ways to do this:

    • dynamically: before every login, he authenticates to the group server, which gives him a ticket with a short lifespan that qualifies him as a member of the group (Shibboleth, Kerberos style). The problem is that this requires an always-on server to issue tickets.
    • statically: he gets a permanent ticket. The problem with the permanent ticket solution is membership revocation. This has to be accomplished through revocation lists that are sent to all sharers of the group.

    Scheme b) also allows grantee anonymous access. The grantee only presents the ticket, which cannot be traced to him personally, but only identifies him as a member of a group.

    Groups design

    For Canoga, we'll implement private groups and public groups. The groups can be nested.

    Private groups will be implemented in the simple way described in the technical discussion: just a nested list data structure in a repository.

    Public groups authentication will be implemented by presenting permanent group tickets that certify group membership. This will allow us to easily interoperate with systems that grant dynamic tickets. Implementation details are:

    A group owner creates a group with a public/private key pair.

    Group's signature on grantee's key certifies that grantee belongs to a group.

    Sharers use group's public key to verify group's signature.

    To join a group, grantee gets his key signed by group's private key. Grantee can present a brand new blank key if he would like anonymous membership.

    To authenticate to a sharer, grantee presents the key signed by the group.

    Group membership is revoked by publishing revocation lists to all sharers. The publishing and processing of revocation lists should be timely and automatic.
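
The data flow of these steps is sketched below with textbook RSA built from two small Mersenne primes. This toy is unpadded and far too weak for real use; it was chosen only so the example runs with the standard library, and a real implementation would use a vetted signature scheme. The sketch also omits the grantee proving that he holds the private half of the key he presents. All names are assumptions.

```python
import hashlib

# Toy RSA parameters (assumption for illustration only, not secure).
P, Q, E = 2**127 - 1, 2**89 - 1, 65537


def _digest(data: bytes, n: int) -> int:
    """Hash the member's key and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


class Group:
    """Group owner's side: holds the private key and the revocation list."""

    def __init__(self):
        self.n = P * Q                           # public modulus
        self._d = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)
        self.revoked = set()

    def sign_member_key(self, member_key: bytes) -> int:
        """Joining the group: the group's signature on the member's key."""
        return pow(_digest(member_key, self.n), self._d, self.n)

    def revoke(self, member_key: bytes):
        self.revoked.add(member_key)


def sharer_accepts(member_key, ticket, group_n, revocation_list):
    """Sharer's side: needs only the group's public key (group_n, E)."""
    if member_key in revocation_list:
        return False
    return pow(ticket, E, group_n) == _digest(member_key, group_n)


group = Group()
ticket = group.sign_member_key(b"bob-public-key")
accepted = sharer_accepts(b"bob-public-key", ticket, group.n, group.revoked)
```

The sharer never contacts the group server at login time; it needs only the group's public key and the latest revocation list, which is why timely distribution of that list matters.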

    This approach has some weaknesses:

    All sharers are required to process revocation lists in a timely manner. Our assumption is that the typical usage case is one where groups are mostly static or growing, with an occasional grantee leaving the group.

    The grantee needs to have access to his repository to prove group membership. Since the users Canoga targets will not have access to central servers, this is a reasonable expectation.

    When a grantee changes or loses his public/private key, he will have to re-enroll in all groups he belongs to.

    Delegation

    Delegation is defined as the ability to give your rights to another person. For example, a professor who is able to change his students' calendars could delegate that right to his admin.

    Cryptographically, delegation of access to Chandler's repository can be implemented simply through signatures. In real-life usage scenarios, however, the user experience of delegation gets complex (deciding what access rights to pass along for calendaring is one example).

    For now, we will not be implementing delegation in Canoga.

    Technical notes

    Certificates:

    We'll use X509 certificates. This is done for compatibility with existing PKI systems.

    The certificates issued by Chandler will be anonymous and self-signed. The binding of certificate to a person will be done by the user.

    Who is our certificate authority for X509s?

    Chandler uses self-signed keys. SDSI/SPKI certificates have authorization information embedded in them. There is no binding to the name. The binding is done by the user. I think we are leaning this way in our usage of X509s.

     

    Ad hoc groups paper: http://research.microsoft.com/users/tuomaura/Publications/maki-aura-hietalahti-nordsec00.pdf