...
 
Commits (15)
  • KVM: x86: remove bogus user-triggerable WARN_ON · d3329454
    Paolo Bonzini authored
    The WARN_ON is essentially comparing a user-provided value with 0.  It is
    trivial to trigger it just by passing garbage to KVM_SET_CLOCK.  Guests
    can break if you do so, but the same applies to every KVM_SET_* ioctl.
    So, if it hurts when you do like this, just do not do it.
    
    Reported-by: syzbot+00be5da1d75f1cc95f6b@syzkaller.appspotmail.com
    Fixes: 9446e6fc ("KVM: x86: fix WARN_ON check of an unsigned less than zero")
    Cc: Sean Christopherson <sean.j.christopherson@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
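To make the "user-provided value" point concrete, here is a minimal userspace sketch (not kernel code) of the removed check. It assumes, as the x86.c hunk shows, that `system_time` is computed as `kernel_ns + kvmclock_offset`; the helper name `system_time_looks_negative` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the removed check:
 *     WARN_ON((s64)vcpu->hv_clock.system_time < 0);
 * The offset is derived from what userspace passes to KVM_SET_CLOCK,
 * so the sum below is effectively user-controlled.
 */
static bool system_time_looks_negative(uint64_t kernel_ns, int64_t kvmclock_offset)
{
    /* Unsigned addition wraps, like the u64 field in the kernel. */
    uint64_t system_time = kernel_ns + (uint64_t)kvmclock_offset;

    /* Reinterpret as signed, as the (s64) cast did (two's complement assumed). */
    return (int64_t)system_time < 0;
}
```

Any offset more negative than the current `kernel_ns` makes the predicate true, which is why a garbage argument was enough to fire the warning.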
  • KVM: SVM: document KVM_MEM_ENCRYPT_OP, let userspace detect if SEV is available · 2da1ed62
    Paolo Bonzini authored
    Userspace has no way to query if SEV has been disabled with the
    sev module parameter of kvm-amd.ko.  Actually it has one, but it
    is a hack: do ioctl(KVM_MEM_ENCRYPT_OP, NULL) and check if it
    returns EFAULT.  Make it a little nicer by returning zero for
    SEV enabled and NULL argument, and while at it document the
    ioctl arguments.
    
    Cc: Brijesh Singh <brijesh.singh@amd.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
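The detection rule described above can be sketched as a small classifier for the result of `ioctl(vm_fd, KVM_MEM_ENCRYPT_OP, NULL)`. This is an illustrative sketch only: the enum and the function name `sev_status_from_probe` are hypothetical, and the caller is assumed to capture `errno` immediately after the ioctl.

```c
#include <errno.h>

/* Hypothetical result type for the probe; not part of any kernel header. */
enum sev_status {
    SEV_ENABLED,            /* new behaviour: NULL argument returns 0 */
    SEV_DISABLED,           /* ENOTTY: SEV is not enabled */
    SEV_ENABLED_OLD_KERNEL, /* EFAULT: older kernel tried to copy from NULL */
    SEV_PROBE_ERROR         /* anything else */
};

/*
 * 'ret' is the ioctl() return value, 'err' is errno read right after it.
 */
static enum sev_status sev_status_from_probe(int ret, int err)
{
    if (ret == 0)
        return SEV_ENABLED;
    if (err == ENOTTY)
        return SEV_DISABLED;
    if (err == EFAULT)
        return SEV_ENABLED_OLD_KERNEL;
    return SEV_PROBE_ERROR;
}
```

A real caller would obtain `vm_fd` from `KVM_CREATE_VM` on `/dev/kvm` and pass the ioctl result to this function; treating `EFAULT` as "enabled" follows the older hack mentioned in the commit message and is only a likely interpretation.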
  • ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL · 76142097
    Ilya Dryomov authored
    CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
    per-pool flags as well.  Unfortunately the backwards compatibility here
    is lacking:
    
    - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
      was guarded by require_osd_release >= RELEASE_LUMINOUS
    - it was subsequently backported to luminous in v12.2.2, but that makes
      no difference to clients that only check OSDMAP_FULL/NEARFULL because
      require_osd_release is not client-facing -- it is for OSDs
    
    Since all kernels are affected, the best we can do here is just start
    checking both map flags and pool flags and send that to stable.
    
    These checks are best effort, so take osdc->lock and look up pool flags
    just once.  Remove the FIXME, since filesystem quotas are checked above
    and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
    its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
    
    Cc: stable@vger.kernel.org
    Reported-by: Yanhu Cao <gmayyyha@gmail.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Acked-by: Sage Weil <sage@redhat.com>
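The combined check can be sketched in plain C as two predicates over a snapshot of the flags. The flag values are copied from the header hunks in this diff; the function names `write_must_fail_enospc` and `write_should_be_sync` are hypothetical, and the sketch leaves out the `osdc->lock` handling that the real code uses to take the snapshot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as they appear in the rados.h and osdmap.h hunks of this diff. */
#define CEPH_OSDMAP_NEARFULL    (1 << 0)
#define CEPH_OSDMAP_FULL        (1 << 1)
#define CEPH_POOL_FLAG_FULL     (1ULL << 1)
#define CEPH_POOL_FLAG_NEARFULL (1ULL << 11)

/* Fail the write with ENOSPC if either the map or the pool is full. */
static bool write_must_fail_enospc(uint32_t map_flags, uint64_t pool_flags)
{
    return (map_flags & CEPH_OSDMAP_FULL) ||
           (pool_flags & CEPH_POOL_FLAG_FULL);
}

/* Force synchronous writes if either the map or the pool is nearly full. */
static bool write_should_be_sync(uint32_t map_flags, uint64_t pool_flags)
{
    return (map_flags & CEPH_OSDMAP_NEARFULL) ||
           (pool_flags & CEPH_POOL_FLAG_NEARFULL);
}
```

Reading both flag words once and reusing them for the FULL and NEARFULL decisions matches the "best effort, look up just once" approach the commit message describes.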
  • libceph: fix alloc_msg_with_page_vector() memory leaks · e8862740
    Ilya Dryomov authored
    Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
    fixing a bunch of memory leaks for a page vector allocated in
    alloc_msg_with_page_vector().  Currently, only watch-notify
    messages trigger this allocation, and normally the page vector
    is freed either in handle_watch_notify() or by the caller of
    ceph_osdc_notify().  But if the message is freed before that
    (e.g. if the session faults while reading in the message or
    if the notify is stale), we leak the page vector.
    
    This was supposed to be fixed by switching to a message-owned
    pagelist, but that never happened.
    
    Fixes: 19079203 ("libceph: support for sending notifies")
    Reported-by: Roman Penyaev <rpenyaev@suse.de>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
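The ownership idea can be illustrated with a userspace analogue. This is a sketch under stated assumptions, not the libceph code: `struct data_item`, `data_item_destroy`, `data_item_take_pages` and the `released_vectors` counter are all hypothetical, and heap buffers stand in for pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CEPH_MSG_DATA_PAGES data item. */
struct data_item {
    void **pages;      /* page vector (plain heap buffers here) */
    size_t num_pages;
    bool own_pages;    /* true: destroying the item frees the vector */
};

static int released_vectors; /* demonstration counter only */

/* Free the vector only if the item owns it, so early teardown cannot leak. */
static void data_item_destroy(struct data_item *d)
{
    if (d->own_pages && d->pages) {
        for (size_t i = 0; i < d->num_pages; i++)
            free(d->pages[i]);
        free(d->pages);
        released_vectors++;
    }
    d->pages = NULL;
}

/* Hand the vector to a consumer, who becomes responsible for freeing it. */
static void **data_item_take_pages(struct data_item *d)
{
    d->own_pages = false;
    return d->pages;
}
```

With this shape, a message that is dropped before anyone consumes its pages releases them in the destroy path, while a consumer that takes the pages clears the flag first, much as the `handle_watch_notify()` hunk sets `data->own_pages = false`.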
  • ceph: fix memory leak in ceph_cleanup_snapid_map() · c8d6ee01
    Luis Henriques authored
    kmemleak reports the following memory leak:
    
    unreferenced object 0xffff88821feac8a0 (size 96):
      comm "kworker/1:0", pid 17, jiffies 4294896362 (age 20.512s)
      hex dump (first 32 bytes):
        a0 c8 ea 1f 82 88 ff ff 00 c9 ea 1f 82 88 ff ff  ................
        00 00 00 00 00 00 00 00 00 01 00 00 00 00 ad de  ................
      backtrace:
        [<00000000b3ea77fb>] ceph_get_snapid_map+0x75/0x2a0
        [<00000000d4060942>] fill_inode+0xb26/0x1010
        [<0000000049da6206>] ceph_readdir_prepopulate+0x389/0xc40
        [<00000000e2fe2549>] dispatch+0x11ab/0x1521
        [<000000007700b894>] ceph_con_workfn+0xf3d/0x3240
        [<0000000039138a41>] process_one_work+0x24d/0x590
        [<00000000eb751f34>] worker_thread+0x4a/0x3d0
        [<000000007e8f0d42>] kthread+0xfb/0x130
        [<00000000d49bd1fa>] ret_from_fork+0x3a/0x50
    
    A kfree is missing while looping the 'to_free' list of ceph_snapid_map
    objects.
    
    Cc: stable@vger.kernel.org
    Fixes: 75c9627e ("ceph: map snapid to anonymous bdev ID")
    Signed-off-by: Luis Henriques <lhenriques@suse.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  • KVM: SVM: Issue WBINVD after deactivating an SEV guest · 2e2409af
    Tom Lendacky authored
    Currently, CLFLUSH is used to flush SEV guest memory before the guest is
    terminated (or a memory hotplug region is removed). However, CLFLUSH is
    not enough to ensure that SEV guest tagged data is flushed from the cache.
    
    With 33af3a7e ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations"), the
    original WBINVD was removed. This then exposed crashes at random times
    because of a cache flush race with a page that had both a hypervisor and
    a guest tag in the cache.
    
    Restore the WBINVD when destroying an SEV guest and add a WBINVD to the
    svm_unregister_enc_region() function to ensure hotplug memory is flushed
    when removed. The DF_FLUSH can still be avoided at this point.
    
    Fixes: 33af3a7e ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations")
    Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
    Message-Id: <c8bf9087ca3711c5770bdeaafa3e45b717dc5ef4.1584720426.git.thomas.lendacky@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  • KVM: LAPIC: Mark hrtimer for period or oneshot mode to expire in hard interrupt context · edec6e01
    He Zhe authored
    apic->lapic_timer.timer was initialized with HRTIMER_MODE_ABS_HARD but
    started later with HRTIMER_MODE_ABS, which may cause the following warning
    in a PREEMPT_RT kernel:
    
    WARNING: CPU: 1 PID: 2957 at kernel/time/hrtimer.c:1129 hrtimer_start_range_ns+0x348/0x3f0
    CPU: 1 PID: 2957 Comm: qemu-system-x86 Not tainted 5.4.23-rt11 #1
    Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1a 09/18/2018
    RIP: 0010:hrtimer_start_range_ns+0x348/0x3f0
    Code: 4d b8 0f 94 c1 0f b6 c9 e8 35 f1 ff ff 4c 8b 45
          b0 e9 3b fd ff ff e8 d7 3f fa ff 48 98 4c 03 34
          c5 a0 26 bf 93 e9 a1 fd ff ff <0f> 0b e9 fd fc ff
          ff 65 8b 05 fa b7 90 6d 89 c0 48 0f a3 05 60 91
    RSP: 0018:ffffbc60026ffaf8 EFLAGS: 00010202
    RAX: 0000000000000001 RBX: ffff9d81657d4110 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000006cc7987bcf RDI: ffff9d81657d4110
    RBP: ffffbc60026ffb58 R08: 0000000000000001 R09: 0000000000000010
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000006cc7987bcf
    R13: 0000000000000000 R14: 0000006cc7987bcf R15: ffffbc60026d6a00
    FS: 00007f401daed700(0000) GS:ffff9d81ffa40000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000ffffffff CR3: 0000000fa7574000 CR4: 00000000003426e0
    Call Trace:
    ? kvm_release_pfn_clean+0x22/0x60 [kvm]
    start_sw_timer+0x85/0x230 [kvm]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    kvm_lapic_switch_to_sw_timer+0x72/0x80 [kvm]
    vmx_pre_block+0x1cb/0x260 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_vmexit+0x1b/0x30 [kvm_intel]
    ? vmx_vmexit+0xf/0x30 [kvm_intel]
    ? vmx_sync_pir_to_irr+0x9e/0x100 [kvm_intel]
    ? kvm_apic_has_interrupt+0x46/0x80 [kvm]
    kvm_arch_vcpu_ioctl_run+0x85b/0x1fa0 [kvm]
    ? _raw_spin_unlock_irqrestore+0x18/0x50
    ? _copy_to_user+0x2c/0x30
    kvm_vcpu_ioctl+0x235/0x660 [kvm]
    ? rt_spin_unlock+0x2c/0x50
    do_vfs_ioctl+0x3e4/0x650
    ? __fget+0x7a/0xa0
    ksys_ioctl+0x67/0x90
    __x64_sys_ioctl+0x1a/0x20
    do_syscall_64+0x4d/0x120
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f4027cc54a7
    Code: 00 00 90 48 8b 05 e9 59 0c 00 64 c7 00 26 00 00
          00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00
          00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff
          73 01 c3 48 8b 0d b9 59 0c 00 f7 d8 64 89 01 48
    RSP: 002b:00007f401dae9858 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 00005558bd029690 RCX: 00007f4027cc54a7
    RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000d
    RBP: 00007f4028b72000 R08: 00005558bc829ad0 R09: 00000000ffffffff
    R10: 00005558bcf90ca0 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 00005558bce1c840
    --[ end trace 0000000000000002 ]--

    Signed-off-by: He Zhe <zhe.he@windriver.com>
    Message-Id: <1584687967-332859-1-git-send-email-zhe.he@windriver.com>
    Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  • KVM: VMX: don't allow memory operands for inline asm that modifies SP · 428b8f1d
    Nick Desaulniers authored
    THUNK_TARGET defines [thunk_target] as having "rm" input constraints
    when CONFIG_RETPOLINE is not set, which isn't constrained enough for
    this specific case.
    
    For inline assembly that modifies the stack pointer before using this
    input, the underspecification of constraints is dangerous, and results
    in an indirect call to a previously pushed flags register.
    
    In this case `entry`'s stack slot is good enough to satisfy the "m"
    constraint in "rm", but the inline assembly in
    handle_external_interrupt_irqoff() modifies the stack pointer via
    push+pushf before using this input, which in this case results in
    calling what was the previous state of the flags register, rather than
    `entry`.
    
    Be more specific in the constraints by requiring `entry` be in a
    register, and not a memory operand.

    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
    Debugged-by: Alexander Potapenko <glider@google.com>
    Debugged-by: Paolo Bonzini <pbonzini@redhat.com>
    Debugged-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
    Message-Id: <20200323191243.30002-1-ndesaulniers@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  • KVM: LAPIC: Also cancel preemption timer when disarm LAPIC timer · 94be4b85
    Wanpeng Li authored
    The timer is disarmed when switching between TSC deadline and other modes,
    and everything should be set to the disarmed state. However, the LAPIC
    timer can be emulated by the preemption timer, which keeps working if
    vmx->hv_deadline_timer is not -1. This patch also cancels the preemption
    timer when disarming the LAPIC timer.

    Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
    Message-Id: <1585031530-19823-1-git-send-email-wanpengli@tencent.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  • KVM: X86: Narrow down the IPI fastpath to single target IPI · e1be9ac8
    Wanpeng Li authored
    The original single target IPI fastpath patch forgot to filter the
    ICR destination shorthand field. Multicast IPI is not suitable for
    this feature, since waking up multiple sleeping vCPUs extends the
    interrupt-disabled time; this is especially bad when the host is
    over-subscribed and the VM has more than a few vCPUs. Let's narrow it
    down to single target IPI.
    
    Two VMs, each with 76 vCPUs, one running 'ebizzy -M', the other
    running cyclictest on all vCPUs: with this patch, the average
    cyclictest score improves by more than 5%. (pv tlb, pv ipi, and pv
    sched yield are disabled during testing to avoid disturbance.)

    Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
    Message-Id: <1585189202-1708-3-git-send-email-wanpengli@tencent.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
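The narrowed filter can be sketched as a standalone predicate on the ICR value. The three comparisons come from the x86.c hunk in this diff; the numeric mask values are assumptions based on the usual definitions in `arch/x86/include/asm/apicdef.h`, and the function name `icr_is_single_target_fixed_ipi` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICR field encodings; values assumed to match asm/apicdef.h. */
#define APIC_DEST_NOSHORT  0x00000u /* no destination shorthand */
#define APIC_DEST_ALLINC   0x80000u /* shorthand: all including self */
#define APIC_SHORT_MASK    0xc0000u
#define APIC_DEST_PHYSICAL 0x00000u
#define APIC_DEST_LOGICAL  0x00800u
#define APIC_DEST_MASK     0x00800u
#define APIC_DM_FIXED      0x00000u
#define APIC_DM_NMI        0x00400u
#define APIC_MODE_MASK     0x00700u

/* True only for a fixed-mode, physical-destination IPI without shorthand. */
static bool icr_is_single_target_fixed_ipi(uint64_t data)
{
    return ((data & APIC_SHORT_MASK) == APIC_DEST_NOSHORT) &&
           ((data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
           ((data & APIC_MODE_MASK) == APIC_DM_FIXED);
}
```

A shorthand such as "all including self" addresses several vCPUs at once, so rejecting any non-zero shorthand keeps multicast IPIs off the fastpath.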
  • parse-maintainers: Do not sort section content by default · 5cdbec10
    Joe Perches authored
    Add an --order switch to control section reordering.
    Default for --order is off.
    
    Change the default ordering to a slightly more sensible:
    
    M:  Person acting as a maintainer
    R:  Person acting as a patch reviewer
    L:  Mailing list where patches should be sent
    S:  Maintenance status
    W:  URI for general information
    Q:  URI for patchwork tracking
    B:  URI for bug tracking/submission
    C:  URI for chat
    P:  URI or file for subsystem specific coding styles
    T:  SCM tree type and location
    F:  File and directory pattern
    X:  File and directory exclusion pattern
    N:  File glob
    K:  Keyword - patch content regex

    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
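The preferred ordering can be sketched as a comparator that ranks each line by the position of its tag letter in the order string. This is a C illustration, not the Perl `by_pattern` routine: the string `MRLSWQBCPTFXNK` is taken from the diff, while `entry_rank`, `by_preferred_order`, the rank of 1000 for unknown tags, and the alphabetical tie-break are assumptions.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Tag order from the new $preferred_order string in the diff. */
static const char preferred_order[] = "MRLSWQBCPTFXNK";

/* Position of the line's tag letter in the order; unknown tags sort last. */
static int entry_rank(const char *line)
{
    const char *p = NULL;

    if (line[0] != '\0')
        p = strchr(preferred_order, toupper((unsigned char)line[0]));
    return p ? (int)(p - preferred_order) : 1000;
}

/* qsort() comparator over an array of 'const char *' section lines. */
static int by_preferred_order(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    int ra = entry_rank(sa);
    int rb = entry_rank(sb);

    if (ra != rb)
        return ra < rb ? -1 : 1;
    return strcmp(sa, sb); /* same tag: fall back to alphabetical order */
}
```

Sorting a section's lines with `qsort(lines, n, sizeof(lines[0]), by_preferred_order)` corresponds to running the script with `--order`; without the switch the lines are written in their original order.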
  • MAINTAINERS: fix bad file pattern · 23cb8490
    Linus Torvalds authored
    Testing 'parse-maintainers' due to the previous commit shows a bad file
    pattern for the "TI VPE/CAL DRIVERS" entry in the MAINTAINERS file.
    
    There's also a lot of mis-ordered entries, but I'm still a bit nervous
    about the inevitable and annoying merge problems it would probably cause
    to fix them up.
    
    The MAINTAINERS file is one of my least favorite files due to being huge
    and centralized, but fixing it is also horribly painful for that reason.

    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a53071bd
    Linus Torvalds authored
    Pull KVM fixes from Paolo Bonzini:
     "x86 bug fixes"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
      KVM: X86: Narrow down the IPI fastpath to single target IPI
      KVM: LAPIC: Also cancel preemption timer when disarm LAPIC timer
      KVM: VMX: don't allow memory operands for inline asm that modifies SP
      KVM: LAPIC: Mark hrtimer for period or oneshot mode to expire in hard interrupt context
      KVM: SVM: Issue WBINVD after deactivating an SEV guest
      KVM: SVM: document KVM_MEM_ENCRYPT_OP, let userspace detect if SEV is available
      KVM: x86: remove bogus user-triggerable WARN_ON
  • Merge tag 'ceph-for-5.6-rc8' of git://github.com/ceph/ceph-client · 60268940
    Linus Torvalds authored
    Pull ceph fixes from Ilya Dryomov:
     "A patch for a rather old regression in fullness handling and two
      memory leak fixes, marked for stable"
    
    * tag 'ceph-for-5.6-rc8' of git://github.com/ceph/ceph-client:
      ceph: fix memory leak in ceph_cleanup_snapid_map()
      libceph: fix alloc_msg_with_page_vector() memory leaks
      ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL
  • afs: Fix unpinned address list during probing · 9efcc4a1
    David Howells authored
    When it's probing all of a fileserver's interfaces to find which one is
    best to use, afs_do_probe_fileserver() takes a lock on the server record
    and notes the pointer to the address list.
    
    It doesn't, however, pin the address list, so as soon as it drops the
    lock, there's nothing to stop the address list from being freed under
    us.
    
    Fix this by taking a ref on the address list inside the locked section
    and dropping it at the end of the function.
    
    Fixes: 3bf0fb6f ("afs: Probe multiple fileservers simultaneously")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
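The pinning pattern can be illustrated with a single-threaded simulation. Everything here is hypothetical scaffolding (`struct addr_list`, `struct server`, `probe`, the `lists_freed` counter), the refcount is a plain integer rather than an atomic one, and the lock is only indicated by comments; the point is the order of get, unlock, use, and put.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the AFS structures. */
struct addr_list { int refcount; int nr_addrs; };
struct server { struct addr_list *addresses; };

static int lists_freed; /* demonstration counter only */

static struct addr_list *addr_list_get(struct addr_list *al)
{
    al->refcount++;
    return al;
}

static void addr_list_put(struct addr_list *al)
{
    if (--al->refcount == 0) {
        free(al);
        lists_freed++;
    }
}

/* Simulates a concurrent update that swaps in a new list. */
static void server_replace_list(struct server *s, struct addr_list *new_list)
{
    struct addr_list *old = s->addresses;

    s->addresses = new_list;
    addr_list_put(old); /* drop the server's reference to the old list */
}

static int probe(struct server *s, struct addr_list *replacement)
{
    /* lock held: note the pointer AND take a reference */
    struct addr_list *al = addr_list_get(s->addresses);
    /* lock dropped: the server may now switch to another list */
    if (replacement)
        server_replace_list(s, replacement);

    int n = al->nr_addrs; /* still valid, because we hold a reference */

    addr_list_put(al);    /* dropped at the end of the function */
    return n;
}
```

Without the `addr_list_get()` call, the replacement step would free the old list while `probe()` still reads from it, which is the use-after-free window the commit closes.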
......@@ -53,6 +53,29 @@ key management interface to perform common hypervisor activities such as
 encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
 information, see the SEV Key Management spec [api-spec]_

+The main ioctl to access SEV is KVM_MEM_ENCRYPT_OP. If the argument
+to KVM_MEM_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
+and ``ENOTTY`` if it is disabled (on some older versions of Linux,
+the ioctl runs normally even with a NULL argument, and therefore will
+likely return ``EFAULT``). If non-NULL, the argument to KVM_MEM_ENCRYPT_OP
+must be a struct kvm_sev_cmd::
+
+       struct kvm_sev_cmd {
+               __u32 id;
+               __u64 data;
+               __u32 error;
+               __u32 sev_fd;
+       };
+
+The ``id`` field contains the subcommand, and the ``data`` field points to
+another struct containing arguments specific to the command. The ``sev_fd``
+should point to a file descriptor that is opened on the ``/dev/sev``
+device, if needed (see individual commands).
+
+On output, ``error`` is zero on success, or an error code. Error codes
+are defined in ``<linux/psp-dev.h>``.
+
 KVM implements the following commands to support common lifecycle events of SEV
 guests, such as launching, running, snapshotting, migrating and decommissioning.
......@@ -90,6 +113,8 @@ Returns: 0 on success, -negative on error
 On success, the 'handle' field contains a new handle and on error, a negative value.

+KVM_SEV_LAUNCH_START requires the ``sev_fd`` field to be valid.
+
 For more details, see SEV spec Section 6.2.

 3. KVM_SEV_LAUNCH_UPDATE_DATA
......
......@@ -16754,7 +16754,7 @@ Q: http://patchwork.linuxtv.org/project/linux-media/list/
 S: Maintained
 F: drivers/media/platform/ti-vpe/
 F: Documentation/devicetree/bindings/media/ti,vpe.yaml
-Documentation/devicetree/bindings/media/ti,cal.yaml
+F: Documentation/devicetree/bindings/media/ti,cal.yaml

 TI WILINK WIRELESS DRIVERS
 L: linux-wireless@vger.kernel.org
......
......@@ -1445,6 +1445,8 @@ static void limit_periodic_timer_frequency(struct kvm_lapic *apic)
 	}
 }

+static void cancel_hv_timer(struct kvm_lapic *apic);
+
 static void apic_update_lvtt(struct kvm_lapic *apic)
 {
 	u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
......@@ -1454,6 +1456,10 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
 	if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
 			APIC_LVT_TIMER_TSCDEADLINE)) {
 		hrtimer_cancel(&apic->lapic_timer.timer);
+		preempt_disable();
+		if (apic->lapic_timer.hv_timer_in_use)
+			cancel_hv_timer(apic);
+		preempt_enable();
 		kvm_lapic_set_reg(apic, APIC_TMICT, 0);
 		apic->lapic_timer.period = 0;
 		apic->lapic_timer.tscdeadline = 0;
......@@ -1715,7 +1721,7 @@ static void start_sw_period(struct kvm_lapic *apic)
 	hrtimer_start(&apic->lapic_timer.timer,
 		apic->lapic_timer.target_expiration,
-		HRTIMER_MODE_ABS);
+		HRTIMER_MODE_ABS_HARD);
 }

 bool kvm_lapic_hv_timer_in_use(struct kvm_vcpu *vcpu)
......
......@@ -1933,14 +1933,6 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
 static void __unregister_enc_region_locked(struct kvm *kvm,
 					   struct enc_region *region)
 {
-	/*
-	 * The guest may change the memory encryption attribute from C=0 -> C=1
-	 * or vice versa for this memory range. Lets make sure caches are
-	 * flushed to ensure that guest data gets written into memory with
-	 * correct C-bit.
-	 */
-	sev_clflush_pages(region->pages, region->npages);
-
 	sev_unpin_memory(kvm, region->pages, region->npages);
 	list_del(&region->list);
 	kfree(region);
......@@ -1970,6 +1962,13 @@ static void sev_vm_destroy(struct kvm *kvm)
 	mutex_lock(&kvm->lock);

+	/*
+	 * Ensure that all guest tagged cache entries are flushed before
+	 * releasing the pages back to the system for use. CLFLUSH will
+	 * not do this, so issue a WBINVD.
+	 */
+	wbinvd_on_all_cpus();
+
 	/*
 	 * if userspace was terminated before unregistering the memory regions
 	 * then lets unpin all the registered memory.
......@@ -7158,6 +7157,9 @@ static int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
 	if (!svm_sev_enabled())
 		return -ENOTTY;

+	if (!argp)
+		return 0;
+
 	if (copy_from_user(&sev_cmd, argp, sizeof(struct kvm_sev_cmd)))
 		return -EFAULT;
......@@ -7285,6 +7287,13 @@ static int svm_unregister_enc_region(struct kvm *kvm,
 		goto failed;
 	}

+	/*
+	 * Ensure that all guest tagged cache entries are flushed before
+	 * releasing the pages back to the system for use. CLFLUSH will
+	 * not do this, so issue a WBINVD.
+	 */
+	wbinvd_on_all_cpus();
+
 	__unregister_enc_region_locked(kvm, region);

 	mutex_unlock(&kvm->lock);
......
......@@ -6287,7 +6287,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 #endif
 		ASM_CALL_CONSTRAINT
 		:
-		THUNK_TARGET(entry),
+		[thunk_target]"r"(entry),
 		[ss]"i"(__KERNEL_DS),
 		[cs]"i"(__KERNEL_CS)
 	);
......
......@@ -1554,7 +1554,10 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
  */
 static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
 {
-	if (lapic_in_kernel(vcpu) && apic_x2apic_mode(vcpu->arch.apic) &&
+	if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(vcpu->arch.apic))
+		return 1;
+
+	if (((data & APIC_SHORT_MASK) == APIC_DEST_NOSHORT) &&
 		((data & APIC_DEST_MASK) == APIC_DEST_PHYSICAL) &&
 		((data & APIC_MODE_MASK) == APIC_DM_FIXED)) {
......@@ -2444,7 +2447,6 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
 	vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
 	vcpu->last_guest_tsc = tsc_timestamp;
-	WARN_ON((s64)vcpu->hv_clock.system_time < 0);

 	/* If the host uses TSC clocksource, then it is stable */
 	pvclock_flags = 0;
......
......@@ -145,6 +145,7 @@ static int afs_do_probe_fileserver(struct afs_net *net,
 	read_lock(&server->fs_lock);
 	ac.alist = rcu_dereference_protected(server->addresses,
 					     lockdep_is_held(&server->fs_lock));
+	afs_get_addrlist(ac.alist);
 	read_unlock(&server->fs_lock);

 	atomic_set(&server->probe_outstanding, ac.alist->nr_addrs);
......@@ -163,6 +164,7 @@ static int afs_do_probe_fileserver(struct afs_net *net,
 	if (!in_progress)
 		afs_fs_probe_done(server);
+	afs_put_addrlist(ac.alist);
 	return in_progress;
 }
......
......@@ -1415,10 +1415,13 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_osd_client *osdc = &fsc->client->osdc;
 	struct ceph_cap_flush *prealloc_cf;
 	ssize_t count, written = 0;
 	int err, want, got;
 	bool direct_lock = false;
+	u32 map_flags;
+	u64 pool_flags;
 	loff_t pos;
 	loff_t limit = max(i_size_read(inode), fsc->max_file_size);
......@@ -1481,8 +1484,12 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		goto out;
 	}

-	/* FIXME: not complete since it doesn't account for being at quota */
-	if (ceph_osdmap_flag(&fsc->client->osdc, CEPH_OSDMAP_FULL)) {
+	down_read(&osdc->lock);
+	map_flags = osdc->osdmap->flags;
+	pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id);
+	up_read(&osdc->lock);
+	if ((map_flags & CEPH_OSDMAP_FULL) ||
+	    (pool_flags & CEPH_POOL_FLAG_FULL)) {
 		err = -ENOSPC;
 		goto out;
 	}
......@@ -1575,7 +1582,8 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	}

 	if (written >= 0) {
-		if (ceph_osdmap_flag(&fsc->client->osdc, CEPH_OSDMAP_NEARFULL))
+		if ((map_flags & CEPH_OSDMAP_NEARFULL) ||
+		    (pool_flags & CEPH_POOL_FLAG_NEARFULL))
 			iocb->ki_flags |= IOCB_DSYNC;
 		written = generic_write_sync(iocb, written);
 	}
......
......@@ -1155,5 +1155,6 @@ void ceph_cleanup_snapid_map(struct ceph_mds_client *mdsc)
 			pr_err("snapid map %llx -> %x still in use\n",
 			       sm->snap, sm->dev);
 		}
+		kfree(sm);
 	}
 }
......@@ -175,9 +175,10 @@ struct ceph_msg_data {
 #endif /* CONFIG_BLOCK */
 		struct ceph_bvec_iter bvec_pos;
 		struct {
-			struct page **pages; /* NOT OWNER. */
+			struct page **pages;
 			size_t length; /* total # bytes */
 			unsigned int alignment; /* first page */
+			bool own_pages;
 		};
 		struct ceph_pagelist *pagelist;
 	};
......@@ -356,8 +357,8 @@ extern void ceph_con_keepalive(struct ceph_connection *con);
 extern bool ceph_con_keepalive_expired(struct ceph_connection *con,
 				       unsigned long interval);

-extern void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
-		size_t length, size_t alignment);
+void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
+			     size_t length, size_t alignment, bool own_pages);
 extern void ceph_msg_data_add_pagelist(struct ceph_msg *msg,
 				       struct ceph_pagelist *pagelist);
 #ifdef CONFIG_BLOCK
......
......@@ -37,6 +37,9 @@ int ceph_spg_compare(const struct ceph_spg *lhs, const struct ceph_spg *rhs);
 #define CEPH_POOL_FLAG_HASHPSPOOL (1ULL << 0) /* hash pg seed and pool id
 						 together */
 #define CEPH_POOL_FLAG_FULL (1ULL << 1) /* pool is full */
+#define CEPH_POOL_FLAG_FULL_QUOTA (1ULL << 10) /* pool ran out of quota,
+						  will set FULL too */
+#define CEPH_POOL_FLAG_NEARFULL (1ULL << 11) /* pool is nearfull */

 struct ceph_pg_pool_info {
 	struct rb_node node;
......@@ -304,5 +307,6 @@ extern struct ceph_pg_pool_info *ceph_pg_pool_by_id(struct ceph_osdmap *map,
 extern const char *ceph_pg_pool_name_by_id(struct ceph_osdmap *map, u64 id);
 extern int ceph_pg_poolid_by_name(struct ceph_osdmap *map, const char *name);
+u64 ceph_pg_pool_flags(struct ceph_osdmap *map, u64 id);

 #endif
......@@ -143,8 +143,10 @@ extern const char *ceph_osd_state_name(int s);
 /*
  * osd map flag bits
  */
-#define CEPH_OSDMAP_NEARFULL (1<<0) /* sync writes (near ENOSPC) */
-#define CEPH_OSDMAP_FULL (1<<1) /* no data writes (ENOSPC) */
+#define CEPH_OSDMAP_NEARFULL (1<<0) /* sync writes (near ENOSPC),
+				       not set since ~luminous */
+#define CEPH_OSDMAP_FULL (1<<1) /* no data writes (ENOSPC),
+				   not set since ~luminous */
 #define CEPH_OSDMAP_PAUSERD (1<<2) /* pause all reads */
 #define CEPH_OSDMAP_PAUSEWR (1<<3) /* pause all writes */
 #define CEPH_OSDMAP_PAUSEREC (1<<4) /* pause recovery */
......
......@@ -3248,12 +3248,16 @@ static struct ceph_msg_data *ceph_msg_data_add(struct ceph_msg *msg)
 static void ceph_msg_data_destroy(struct ceph_msg_data *data)
 {
-	if (data->type == CEPH_MSG_DATA_PAGELIST)
+	if (data->type == CEPH_MSG_DATA_PAGES && data->own_pages) {
+		int num_pages = calc_pages_for(data->alignment, data->length);
+		ceph_release_page_vector(data->pages, num_pages);
+	} else if (data->type == CEPH_MSG_DATA_PAGELIST) {
 		ceph_pagelist_release(data->pagelist);
+	}
 }

 void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
-		size_t length, size_t alignment)
+			     size_t length, size_t alignment, bool own_pages)
 {
 	struct ceph_msg_data *data;
......@@ -3265,6 +3269,7 @@ void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
 	data->pages = pages;
 	data->length = length;
 	data->alignment = alignment & ~PAGE_MASK;
+	data->own_pages = own_pages;

 	msg->data_length += length;
 }
......
......@@ -962,7 +962,7 @@ static void ceph_osdc_msg_data_add(struct ceph_msg *msg,
 		BUG_ON(length > (u64) SIZE_MAX);
 		if (length)
 			ceph_msg_data_add_pages(msg, osd_data->pages,
-					length, osd_data->alignment);
+					length, osd_data->alignment, false);
 	} else if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGELIST) {
 		BUG_ON(!length);
 		ceph_msg_data_add_pagelist(msg, osd_data->pagelist);
......@@ -4436,9 +4436,7 @@ static void handle_watch_notify(struct ceph_osd_client *osdc,
 						CEPH_MSG_DATA_PAGES);
 				*lreq->preply_pages = data->pages;
 				*lreq->preply_len = data->length;
-			} else {
-				ceph_release_page_vector(data->pages,
-				       calc_pages_for(0, data->length));
+				data->own_pages = false;
 			}
 		}
 		lreq->notify_finish_error = return_code;
......@@ -5506,9 +5504,6 @@ static struct ceph_msg *get_reply(struct ceph_connection *con,
 	return m;
 }

-/*
- * TODO: switch to a msg-owned pagelist
- */
 static struct ceph_msg *alloc_msg_with_page_vector(struct ceph_msg_header *hdr)
 {
 	struct ceph_msg *m;
......@@ -5522,7 +5517,6 @@ static struct ceph_msg *alloc_msg_with_page_vector(struct ceph_msg_header *hdr)
 	if (data_len) {
 		struct page **pages;
-		struct ceph_osd_data osd_data;

 		pages = ceph_alloc_page_vector(calc_pages_for(0, data_len),
 					       GFP_NOIO);
......@@ -5531,9 +5525,7 @@ static struct ceph_msg *alloc_msg_with_page_vector(struct ceph_msg_header *hdr)
 			return NULL;
 		}

-		ceph_osd_data_pages_init(&osd_data, pages, data_len, 0, false,
-					 false);
-		ceph_osdc_msg_data_add(m, &osd_data);
+		ceph_msg_data_add_pages(m, pages, data_len, 0, true);
 	}

 	return m;
......
......@@ -710,6 +710,15 @@ int ceph_pg_poolid_by_name(struct ceph_osdmap *map, const char *name)
 }
 EXPORT_SYMBOL(ceph_pg_poolid_by_name);

+u64 ceph_pg_pool_flags(struct ceph_osdmap *map, u64 id)
+{
+	struct ceph_pg_pool_info *pi;
+
+	pi = __lookup_pg_pool(&map->pg_pools, id);
+	return pi ? pi->flags : 0;
+}
+EXPORT_SYMBOL(ceph_pg_pool_flags);
+
 static void __remove_pg_pool(struct rb_root *root, struct ceph_pg_pool_info *pi)
 {
 	rb_erase(&pi->node, root);
......
......@@ -8,13 +8,14 @@ my $input_file = "MAINTAINERS";
 my $output_file = "MAINTAINERS.new";
 my $output_section = "SECTION.new";
 my $help = 0;
+my $order = 0;
 my $P = $0;

 if (!GetOptions(
         'input=s' => \$input_file,
         'output=s' => \$output_file,
         'section=s' => \$output_section,
+        'order!' => \$order,
         'h|help|usage' => \$help,
     )) {
     die "$P: invalid argument - use --help if necessary\n";
......@@ -32,6 +33,22 @@ usage: $P [options] <pattern matching regexes>
   --input => MAINTAINERS file to read (default: MAINTAINERS)
   --output => sorted MAINTAINERS file to write (default: MAINTAINERS.new)
   --section => new sorted MAINTAINERS file to write to (default: SECTION.new)
+  --order => Use the preferred section content output ordering (default: 0)
+    Preferred ordering of section output is:
+      M: Person acting as a maintainer
+      R: Person acting as a patch reviewer
+      L: Mailing list where patches should be sent
+      S: Maintenance status
+      W: URI for general information
+      Q: URI for patchwork tracking
+      B: URI for bug tracking/submission
+      C: URI for chat
+      P: URI or file for subsystem specific coding styles
+      T: SCM tree type and location
+      F: File and directory pattern
+      X: File and directory exclusion pattern
+      N: File glob
+      K: Keyword - patch content regex

 If <pattern match regexes> exist, then the sections that match the
 regexes are not written to the output file but are written to the
......@@ -56,7 +73,7 @@ sub by_category($$) {
 sub by_pattern($$) {
     my ($a, $b) = @_;
-    my $preferred_order = 'MRPLSWTQBCFXNK';
+    my $preferred_order = 'MRLSWQBCPTFXNK';

     my $a1 = uc(substr($a, 0, 1));
     my $b1 = uc(substr($b, 0, 1));
......@@ -105,8 +122,14 @@ sub alpha_output {
             print $file $separator;
         }
         print $file $key . "\n";
-        foreach my $pattern (sort by_pattern split('\n', %$hashref{$key})) {
-            print $file ($pattern . "\n");
+        if ($order) {
+            foreach my $pattern (sort by_pattern split('\n', %$hashref{$key})) {
+                print $file ($pattern . "\n");
+            }
+        } else {
+            foreach my $pattern (split('\n', %$hashref{$key})) {
+                print $file ($pattern . "\n");
+            }
         }
     }
 }
......